Optimized Design of Distributed Quasi-Cyclic LDPC Coded Spatial Modulation

We propose a distributed quasi-cyclic low-density parity-check (QC-LDPC) coded spatial modulation (D-QC-LDPCC-SM) scheme with source, relay and destination nodes. Two distinct QC-LDPC codes are used at the source and the relay. The relay selects a subset of the source information bits for further encoding, and each selection yields a corresponding distributed code at the destination. To construct the best code, we propose an optimal information bit selection algorithm at the relay based on exhaustive search. However, exhaustive search has high complexity for QC-LDPC codes with long block lengths, so we also develop a low-complexity information bit selection algorithm based on partial search. Moreover, an iterative decoding algorithm based on the three-layer Tanner graph is proposed at the destination to jointly decode the received signals. The recently developed polar-coded cooperative SM (PCC-SM) scheme does not adopt an improved encoding method at the relay, which motivates us to compare it with the proposed D-QC-LDPCC-SM scheme. Simulations show that the proposed exhaustive-based and partial-based search algorithms outperform the random selection approach by 1 and 1.2 dB, respectively. Because the proposed D-QC-LDPCC-SM system uses the optimized algorithm to select the information bits for further encoding, it outperforms the PCC-SM scheme by 3.1 dB.


Introduction
Multiple-input multiple-output (MIMO) is a technique that effectively improves system reliability because the signals are transmitted and received through multiple antennas [1]. A well-known MIMO technique is vertical Bell Labs layered space-time (V-BLAST) [2]. In V-BLAST, multiple antennas simultaneously transmit data at the same frequency, which causes high inter-channel interference at the receiver. Further, the signal transmission requires inter-antenna synchronization (IAS). Spatial modulation (SM) [3] avoids both problems because it activates only one antenna to transmit the data signal in each transmission slot. Additionally, in SM, the index of the activated antenna carries information, which enhances the spectral efficiency. The transmission and detection process of SM has been discussed in many studies, such as [4,5].
Another technology that can effectively enhance the system reliability is cooperative diversity with source, relay and destination nodes [6]. In cooperative communications, the relay overhears the source messages and utilizes cooperative techniques such as decode-and-forward [7] to relay the information. Compared with direct transmission, cooperative communications are able to obtain lower error probabilities, as presented in [8]. In [9,10], it was reported that a cooperative communication system based on two users can enhance the achievable rate of both users. Due to these theoretical achievements, cooperative communications have attracted substantial attention. A very important discovery is

• The D-QC-LDPCC-SM scheme is proposed, in which two different QC-LDPC codes are separately used at the source and relay. At the relay, information bits are selected from the decoded source information bits, and the selected bits are further encoded. By combining the two LDPC codewords generated at the source and relay, the destination constructs a channel code corresponding to each information selection.
• To let the destination construct the best code, an optimal information bit selection algorithm by exhaustive search is proposed at the relay to appropriately select partial source information bits. In the exhaustive-based search algorithm, the best pattern is chosen from all selection patterns, during which we consider all source information bit sequences.
• Since all the source information bit sequences and all the selection patterns are considered, the complexity of the optimal algorithm is relatively high when the QC-LDPC codes have large block lengths. Based on this, another partial-based search information bit selection algorithm is proposed, in which only partial source information bit sequences and partial selection patterns are taken into account.
• At the destination, a joint iterative decoding algorithm based on the three-layer Tanner graph is proposed to effectively recover the source information by using the equivalent parity-check matrix.
The proposed optimized selection algorithm chooses an optimized pattern from the selection patterns considered at the relay, and the proposed joint decoding algorithm based on the three-layer Tanner graph performs single-step decoding using the equivalent parity-check matrix, fully exchanging extrinsic information during each iteration. Together, these significantly enhance the performance of the entire system. However, for QC-LDPC codes with larger block lengths, the complexity of the proposed algorithms increases because more selection patterns and a larger equivalent parity-check matrix must be considered.
The remainder of this article is organized as follows. Section 2 describes the proposed D-QC-LDPCC-SM scheme by selection in the relay. Section 3 proposes two optimized information bit selection algorithms. In Section 4, we describe the joint iterative decoding algorithm based on the three-layer Tanner graph. Section 5 analyzes the simulation results in detail. Finally, we conclude this paper.
Notation: Bold italic lowercase and capital letters denote vectors and matrices, respectively. $0_{Z \times Z}$ is the zero matrix of size $Z \times Z$. $\lceil x \rceil$ is the minimum integer no less than $x$, and "mod" represents the modulo operation. $[\cdot]^T$ is used for transpose. $\mathbb{C}$ denotes the complex domain. $\mathcal{CN}(\mu, \sigma^2)$ denotes the complex Gaussian distribution with mean $\mu$ and variance $\sigma^2$. $C_{K_1}^{K_2}$ is a binomial coefficient. $|a|b|$ denotes the serial concatenation of $a$ and $b$. $|\cdot|$ denotes the number of elements in a set.

Distributed QC-LDPC Coded SM Scheme by Selection in the Relay
This section presents the D-QC-LDPCC-SM scheme by selection in the relay. We first review 5G LDPC codes and then describe the system model of the proposed scheme.

Characteristics and Encoding of 5G LDPC Codes
The parity-check matrix $H$ of a QC-LDPC code is an array of $Z \times Z$ blocks:
$$ H = \begin{bmatrix} A^{e_{1,1}} & \cdots & A^{e_{1,n_b}} \\ \vdots & \ddots & \vdots \\ A^{e_{m_b,1}} & \cdots & A^{e_{m_b,n_b}} \end{bmatrix}, \tag{1} $$
in which $Z$ denotes the lifting size. In Equation (1), matrix $A$ is the $Z \times Z$ circularly-shifted identity matrix and has the following expression:
$$ A = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ 1 & 0 & 0 & \cdots & 0 \end{bmatrix}. $$
Matrix $A^{e_{i,j}}$ associated with $A$ and $e_{i,j}$ in Equation (1) is given by:
$$ A^{e_{i,j}} = \begin{cases} 0_{Z \times Z}, & e_{i,j} = -1, \\ (A)^{e_{i,j}}, & \text{otherwise}. \end{cases} $$
The non-negative exponent $e_{i,j}$ of $(A)^{e_{i,j}}$ is called the shift value. All exponents $e_{i,j}$ construct the following exponent matrix:
$$ E = [e_{i,j}]_{1 \le i \le m_b,\, 1 \le j \le n_b}. $$
By the above exponent matrix, the base matrix $B = [b_{i,j}]$ is directly written by:
$$ b_{i,j} = \begin{cases} 0, & e_{i,j} = -1, \\ 1, & \text{otherwise}. \end{cases} $$
The 5G LDPC codes adopt the so-called Raptor-like structure, and the base matrix has the sketch depicted in Figure 1. C and G construct the core, where G is a square matrix whose first column contains three 1s and whose other columns together form a bidiagonal structure. O (zero matrix), D and I (identity matrix) construct the extension. Moreover, C corresponds to the information bits, G corresponds to the parity bits of the core and I corresponds to the extended parity bits.
Two base matrices, $B_1$ and $B_2$, with similar structures are supported in 5G LDPC codes. In 5G LDPC codes, 16 exponent matrices are supported, of which eight correspond to each base matrix. The exponent matrices support all lifting sizes, and their relationship is listed in Table 1. For $B_1$, the supported information length $K$ and the code rate $R$ are $308 < K \le 8448$ and $1/3 \le R \le 8/9$, respectively. For $B_2$, $40 \le K \le 3840$ and $1/5 \le R \le 2/3$ are supported, respectively. To adapt variable $K$ and $R$ ($R = K/N$) in 5G $(N, K)$ QC-LDPC codes, i.e., to adapt diverse codeword lengths $N$, the shortening and puncturing approaches are required. The steps that yield the $N$ transmitted codeword bits are as follows:
Step 1: Obtain the base matrix and $k_z$ (the number of information circulant columns of the base matrix) for the given $K$ and $R$.
(1) For $B_1$
Step 2: Select the minimal $Z$ from Table 1 such that $k_z Z \ge K$. Using $Z$, determine the matrix $E = [e_{i,j}]_{1 \le i \le m_b,\, 1 \le j \le n_b}$ from the eight exponent matrices in Table 1. For $B_1$ and $B_2$, $(m_b, n_b) = (46, 68)$ and $(42, 52)$, respectively.
Step 3: Compute the shift matrix $P = [p_{i,j}]$, where $p_{i,j} = -1$ for $e_{i,j} = -1$, and $p_{i,j} = e_{i,j} \pmod{Z}$ for $e_{i,j} \ne -1$.
Step 4: By dispersing each element of $P$ into a zero matrix or a circularly-shifted identity matrix of size $Z \times Z$, the $m_z Z \times n_z Z$ parity-check matrix $H$ utilized for the encoding and decoding of $(N, K)$ LDPC codes is obtained.
Step 5: Obtain a sequence of length $k_z Z$ by appending $k_z Z - K$ zero bits to the end of the information sequence of length $K$. Then, use $H$ to encode the obtained sequence of length $k_z Z$ into the codeword sequence $c$ of length $n_z Z$. Finally, the transmitted codeword bits with length $N$ are obtained by removing the supplemented zero bits and puncturing partial codeword bits, as exhibited in Figure 2.
Table 1. Relationship between the exponent matrices and lifting size sets.
Exponent Matrices    Lifting Size Sets
Sensors 2023, 23, x FOR PEER REVIEW
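Steps 1 to 4 above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the lifting-size sets and the rule for $k_z$ are taken from 3GPP TS 38.212 rather than recovered from Table 1 or Step 1 of this text, the helper names (`info_columns`, `select_lifting_size`, `shift_matrix`, `expand`) are hypothetical, and each circulant is assumed to be the identity matrix shifted to the right.

```python
# Lifting-size sets, indexed by exponent-matrix set.
# Assumption: values follow 3GPP TS 38.212, not the (lost) Table 1 of this text.
LIFTING_SETS = {
    0: [2, 4, 8, 16, 32, 64, 128, 256],
    1: [3, 6, 12, 24, 48, 96, 192, 384],
    2: [5, 10, 20, 40, 80, 160, 320],
    3: [7, 14, 28, 56, 112, 224],
    4: [9, 18, 36, 72, 144, 288],
    5: [11, 22, 44, 88, 176, 352],
    6: [13, 26, 52, 104, 208],
    7: [15, 30, 60, 120, 240],
}


def info_columns(base, k):
    """Step 1 (assumed 3GPP rule): number of information columns k_z."""
    if base == 1:
        return 22
    if k > 640:
        return 10
    if k > 560:
        return 9
    if k > 192:
        return 8
    return 6


def select_lifting_size(k, k_z):
    """Step 2: smallest Z with k_z * Z >= K, plus the index of its set."""
    return min((z, i) for i, zs in LIFTING_SETS.items()
               for z in zs if k_z * z >= k)


def shift_matrix(e, z):
    """Step 3: p = -1 where e = -1, otherwise p = e mod Z."""
    return [[-1 if x == -1 else x % z for x in row] for row in e]


def expand(p, z):
    """Step 4: replace each entry of P by a Z x Z zero block or circulant."""
    h = []
    for row in p:
        for r in range(z):
            line = []
            for s in row:
                block = [0] * z
                if s != -1:
                    block[(r + s) % z] = 1  # identity shifted right by s
                line.extend(block)
            h.append(line)
    return h
```

For example, `select_lifting_size(8448, 22)` picks the largest supported lifting size, since $22 \times 384 = 8448$. Shortening and puncturing (Step 5) are deliberately left out of this sketch.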

System Model in Cooperative Communications
Figure 3 exhibits the D-QC-LDPCC-SM system model, in which the source (S) and relay (R) adopt different 5G LDPC codes. The R uses the decode-and-forward protocol. Additionally, the S, R and destination (D) deploy $N_T$, $N_T$ and $N_R$ antennas, respectively. Completing a whole system transmission requires two time slots.
In time slot-1, the information bit sequence $m$ with length $K_1$ at the S is given to the 5G $(N_1, K_1)$ QC-LDPC$_1$ encoder with parameters $m_z^{(1)}$, $n_z^{(1)}$ and $Z^{(1)}$, which are defined as $m_z$, $n_z$ and $Z$, respectively. First, $m$ is encoded into the QC-LDPC codeword sequence $v^{(1)}$ of length $n_z^{(1)} Z^{(1)}$. Then, by removing the added zero bits and puncturing the codeword bits, the transmitted codeword bit sequence $v$ of length $N_1$ is generated, as introduced in Section 2.1. Next, $v$ is sent to the SM mapper to generate the SM vectors, and the process is illustrated in Figure 4a. Specifically, the buffer obtains $v$ and generates multiple sequences $v(k_1)$ of length $l$ ($l = \log_2(N_T M)$), where $M$ is the constellation size and $k_1 = 1, 2, \cdots, (N_1 + d_1)/l$, with $d_1$ denoting the number of zero bits added at the end of $v$ to make $(N_1 + d_1)/l$ an integer. The bit splitter divides $v(k_1)$ into two parts, $v_1(k_1)$ and $v_2(k_1)$, where $v_1(k_1)$ is composed of the first $\log_2(N_T)$ bits and $v_2(k_1)$ is composed of the remaining $\log_2(M)$ bits. The antenna mapper takes $v_1(k_1)$ and maps it to the transmit antenna index $a_1(k_1) \in \{1, 2, \cdots, N_T\}$. The symbol mapper takes $v_2(k_1)$ and maps it to the $M$-ary modulated symbol $v^S_{m_1}(k_1)$, where $m_1 \in \{1, 2, \cdots, M\}$. Then, the SM modulator assigns the modulated symbol $v^S_{m_1}(k_1)$ to the transmit antenna with index $a_1(k_1)$ and outputs the transmission vector
$$ v^S_{m_1,a_1}(k_1) = [0, \cdots, 0, v^S_{m_1}(k_1), 0, \cdots, 0]^T, $$
whose only non-zero entry is in the $a_1(k_1)$-th position. The transmission vector $v^S_{m_1,a_1}(k_1)$ is separately sent to the R and D through slow Rayleigh fading channels, $H_{S,R} \in \mathbb{C}^{N_T \times N_T}$ and $H_{S,D} \in \mathbb{C}^{N_R \times N_T}$, to generate the signal vectors $y_{S,R}(k_1) \in \mathbb{C}^{N_T \times 1}$ and $y_{S,D}(k_1) \in \mathbb{C}^{N_R \times 1}$:
$$ \begin{aligned} y_{S,R}(k_1) &= H_{S,R}\, v^S_{m_1,a_1}(k_1) + n_{S,R}(k_1) = h^{S,R}_{a_1(k_1)}\, v^S_{m_1}(k_1) + n_{S,R}(k_1), \\ y_{S,D}(k_1) &= H_{S,D}\, v^S_{m_1,a_1}(k_1) + n_{S,D}(k_1) = h^{S,D}_{a_1(k_1)}\, v^S_{m_1}(k_1) + n_{S,D}(k_1), \end{aligned} \tag{7} $$
where $h^{S,R}_{a_1(k_1)}$ and $h^{S,D}_{a_1(k_1)}$ denote the $a_1(k_1)$-th column of $H_{S,R}$ and $H_{S,D}$, respectively. $n_{S,R}(k_1) \in \mathbb{C}^{N_T \times 1}$ and $n_{S,D}(k_1) \in \mathbb{C}^{N_R \times 1}$ denote the noise vectors. The elements of $H_{S,R}$ and $n_{S,R}(k_1)$ obey the distributions $\mathcal{CN}(0, 1)$ and $\mathcal{CN}(0, \sigma^2)$, respectively. Additionally, $H_{S,D}$ and $n_{S,D}(k_1)$ are defined in the same way as $H_{S,R}$ and $n_{S,R}(k_1)$, respectively.
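The SM mapper described above (buffer, bit splitter, antenna mapper and symbol mapper) can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `sm_map`, the natural-binary labelling of antennas and symbols, the 0-based antenna index and the four-point constellation in the usage example are assumptions, not details given in the text (the simulations use 16-QAM).

```python
def sm_map(bits, n_t, constellation):
    """Map a bit sequence onto spatial-modulation transmission vectors.

    Assumes n_t and len(constellation) are powers of two, both at least 2.
    """
    ant_bits = n_t.bit_length() - 1                 # log2(N_T)
    sym_bits = len(constellation).bit_length() - 1  # log2(M)
    l = ant_bits + sym_bits                         # bits per SM vector
    padded = list(bits) + [0] * (-len(bits) % l)    # append d_1 zero bits
    vectors = []
    for start in range(0, len(padded), l):
        block = padded[start:start + l]
        # Bit splitter: first log2(N_T) bits select the antenna,
        # the remaining log2(M) bits select the symbol.
        ant = int("".join(map(str, block[:ant_bits])), 2)
        sym = constellation[int("".join(map(str, block[ant_bits:])), 2)]
        vec = [0j] * n_t
        vec[ant] = sym                              # only one active antenna
        vectors.append(vec)
    return vectors
```

As a usage example, with four antennas and the hypothetical constellation `[1+1j, -1+1j, 1-1j, -1-1j]`, the bits `[1, 0, 1, 1, 0, 1]` are padded to eight bits and produce two vectors, each with exactly one non-zero entry.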
In time slot-2, the SM demapper in R is adopted to obtain the log-likelihood ratio (LLR) sequence of $v$, and the process is illustrated in Figure 4b. Specifically, the SM demodulator using the maximum-likelihood detection approach [21] demodulates the signal $y_{S,R}(k_1)$ to generate the LLR sequences $\phi_{S,R}(a_1(k_1))$ and $\phi_{S,R}(v^S_{m_1}(k_1))$ of $a_1(k_1)$ and $v^S_{m_1}(k_1)$, respectively. The bit combiner then yields the LLR sequence $\phi_{S,R}(v(k_1))$. After the buffer, the length-$N_1$ LLR sequence $\phi_{S,R}(v)$ of $v$ is generated. The QC-LDPC decoder is used to yield the estimated source information $\bar{m}$. In the R, $K_2$ ($K_2 < K_1$) information bits $m_j$ ($j \in \{1, 2, \cdots, J = C_{K_1}^{K_2}\}$) are selected from $\bar{m}$. Note that the proper selection helps the D construct an optimized distributed code, and thus designing the optimized information selection algorithms is very important. The details of the optimized algorithms will be introduced in Section 3. Through the 5G $(N_2, K_2)$ QC-LDPC$_2$ encoder with parameters $m_z^{(2)}$, $n_z^{(2)}$ and $Z^{(2)}$, $m_j$ is encoded into the codeword $|m_j|v^{(2)}_{j,p}|$ with length $n_z^{(2)} Z^{(2)}$, where the parity part $v^{(2)}_{j,p}$ has the length $n_z^{(2)} Z^{(2)} - K_2$. We send the transmitted parity bit sequence $v_{j,p}$ of length $M_2$ to the SM mapper, which generates the transmission vectors $v^R_{m_2,a_2}(k_2)$, where $k_2 = 1, 2, \cdots, (M_2 + d_2)/l$ and the symbols $v^R_{m_2}(k_2)$ are transmitted from the $a_2(k_2)$-th antenna with $m_2 \in \{1, 2, \cdots, M\}$ and $a_2(k_2) \in \{1, 2, \cdots, N_T\}$. Through the slow Rayleigh fading channel $H_{R,D} \in \mathbb{C}^{N_R \times N_T}$ (whose definition is similar to $H_{S,R}$ in Equation (7)), the vector $v^R_{m_2,a_2}(k_2)$ is sent to the D, which obtains the signal vector $y_{R,D}(k_2) \in \mathbb{C}^{N_R \times 1}$:
$$ y_{R,D}(k_2) = H_{R,D}\, v^R_{m_2,a_2}(k_2) + n_{R,D}(k_2) = h^{R,D}_{a_2(k_2)}\, v^R_{m_2}(k_2) + n_{R,D}(k_2), $$
where $h^{R,D}_{a_2(k_2)}$ denotes the $a_2(k_2)$-th column of $H_{R,D}$, and the noise vector $n_{R,D}(k_2) \in \mathbb{C}^{N_R \times 1}$ has a similar definition to $n_{S,R}(k_1)$ in Equation (7).
During the respective time slots, the SM demapper in D demodulates the signal vectors $y_{S,D}(k_1)$ and $y_{R,D}(k_2)$ to obtain the LLR sequences $\phi_{S,D}(v)$ and $\phi_{R,D}(v_{j,p})$ corresponding to $v$ and $v_{j,p}$, respectively. Through the multiplexer, $\phi_{S,D}(v)$ and $\phi_{R,D}(v_{j,p})$ are merged into the length-$\bar{N}$ ($\bar{N} = N_1 + M_2$) LLR sequence $\bar{\phi}_0$ corresponding to the transmitted codeword sequence $|v|v_{j,p}|$. Since $|v|v_{j,p}|$ are the transmitted bits of the distributed LDPC codeword $|v^{(1)}|v^{(2)}_{j,p}|$ with length $\bar{\bar{N}} = n_z^{(1)} Z^{(1)} + n_z^{(2)} Z^{(2)} - K_2$, we need to complete the initial LLRs of the punctured and shortened bits for $\bar{\phi}_0$ so that the joint LDPC decoder obtains an LLR sequence of length $\bar{\bar{N}}$. The specific contents of joint LDPC decoding are described in Section 4.

Optimized Information Bit Selection Algorithms at the Relay
In R, we select $K_2$ bits $m_j$ from $m$ with length $K_1$ as the input of the $(N_2, K_2)$ QC-LDPC$_2$ encoder, where $j$ is the selection order denoted as follows:
$$ j \in \{1, 2, \cdots, J = C_{K_1}^{K_2}\}. $$
Note that each $j$ corresponds to a $K_2$-dimensional vector, i.e.,
$$ \psi_j = [w_1^{(j)}, w_2^{(j)}, \cdots, w_{K_2}^{(j)}], $$
where vector $\psi_j$ is called the selection pattern, and element $w_i^{(j)}$ represents the position of the selected bit in $m$. All selection patterns form the following set:
$$ \varphi = \{\psi_1, \psi_2, \cdots, \psi_J\}. $$
For the $j$-th selection, the distributed code at D is $C_D^{(j)}(\bar{N}, K_1) = \{|v|v_{j,p}|\}$. If the input sequence of the $(N_2, K_2)$ QC-LDPC$_2$ encoder were independent of $m$, the D would generate the distributed code $C_D$ of length $\bar{N}$. To obtain an optimized code $C_D^{(j)}(\bar{N}, K_1)$ at D, two optimized information selection algorithms, called the optimal algorithm and the low-complexity algorithm, are proposed to appropriately select the partial information from $\bar{m}$. The following description of the optimized selection algorithms is based on the assumption of correct decoding (i.e., $\bar{m} = m$) at R.

Exhaustive-Based Search Optimal Information Bit Selection Algorithm
In the optimal algorithm based on the exhaustive search, we determine, from all $J$ selection patterns, the best pattern resulting in the optimal code with the best codeword weight distribution, during which all $2^{K_1}$ source information bit sequences are considered. The specific steps of the optimal algorithm are listed in Algorithm 1.
Algorithm 1: Optimal Algorithm.
(2) Take into account all possible codeword weights $wt(|v|v_{j,p}|) \le \bar{N}$ resulting from the $2^{K_1}$ source information sequences, where $j \in \varepsilon$.
(4) Determine the number $M_w^{(j)}$ of codewords with weight $wt(|v|v_{j,p}|) = w$ for each $j \in \varepsilon_t$ and $\psi_j \in \varphi_t$. Find the selection orders $j$ achieving $\min_{j \in \varepsilon_t} M_w^{(j)}$ to form a set $\varepsilon_{t+1} \subseteq \varepsilon_t$, and then determine the set $\varphi_{t+1} \subseteq \varphi_t$ corresponding to $\varepsilon_{t+1}$.
(5) If $w < \bar{N}$ and $|\varphi_{t+1}| \ne 1$, increase the parameters $t$ and $w$ by 1, respectively, and go to step 4. Otherwise, terminate the overall search algorithm and obtain the best selection pattern $\psi^{(1)} = \psi_j$.
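The weight-based elimination in steps 4 and 5 can be illustrated with a small Python sketch. It is only a toy model under stated assumptions: the source and relay codes are tiny systematic binary codes described by hypothetical parity sub-matrices `p_src` and `p_rel` (not 5G QC-LDPC codes), the search is assumed to start at weight $w = 1$ because the initialization step is not recoverable from the text, and the first remaining pattern is returned if several patterns tie at every weight.

```python
from itertools import combinations, product


def encode_parity(msg, parity_rows):
    """Parity bits of a toy systematic code: p = msg * P (mod 2)."""
    n_par = len(parity_rows[0])
    return [sum(m * row[c] for m, row in zip(msg, parity_rows)) % 2
            for c in range(n_par)]


def weight_spectrum(pattern, k1, p_src, p_rel):
    """Number of distributed codewords |v|v_jp| of each weight."""
    n_bar = k1 + len(p_src[0]) + len(p_rel[0])
    counts = [0] * (n_bar + 1)
    for msg in product((0, 1), repeat=k1):          # all 2^K1 messages
        v = list(msg) + encode_parity(msg, p_src)   # source codeword
        selected = [msg[i] for i in pattern]        # bits chosen at relay
        v_jp = encode_parity(selected, p_rel)       # relay parity bits
        counts[sum(v) + sum(v_jp)] += 1
    return counts


def exhaustive_selection(k1, k2, p_src, p_rel):
    """Keep, weight by weight, the patterns with the fewest codewords."""
    survivors = {pat: weight_spectrum(pat, k1, p_src, p_rel)
                 for pat in combinations(range(k1), k2)}
    n_bar = k1 + len(p_src[0]) + len(p_rel[0])
    for w in range(1, n_bar + 1):
        best = min(spec[w] for spec in survivors.values())
        survivors = {p: s for p, s in survivors.items() if s[w] == best}
        if len(survivors) == 1:                     # unique best pattern
            break
    return next(iter(survivors))
```

The cost of this sketch grows as $J \cdot 2^{K_1}$ encodings, which is exactly why the exhaustive search is limited to short codes.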

Partial-Based Search Low-Complexity Information Bit Selection Algorithm
For codes with larger block lengths, Algorithm 1 has high computational complexity. We therefore propose a low-complexity selection algorithm based on partial search. Unlike Algorithm 1, it considers only part of the $J$ selection patterns and only part of the $2^{K_1}$ source information sequences. Algorithm 2 lists the specific design steps.

Algorithm 2: Low-complexity Algorithm.
(1) Determine the set $\varphi$ of $Q$ ($Q < J$) selection patterns, and then get the selection order set $\varepsilon$ corresponding to $\varphi$:
(a) Divide the length-$K_1$ source information bit sequence $m$ into two almost equal parts, i.e., one part has $L = \lceil (K_1 + 1)/2 \rceil$ bits and the remaining part has $K_1 - L$ bits. Therefore, there are two cases. In the first case, the first and second parts have $L$ and $K_1 - L$ bits, respectively. In the second case, the first and second parts have $K_1 - L$ and $L$ bits, respectively.
(b) Select $K_2$ bits from the two parts in each case. In the first case, randomly choose more bits (i.e., $T$ bits) from the first part and fixedly select $K_2 - T$ bits from the second part, where $\lceil (K_2 + 1)/2 \rceil \le T \le \min(K_2, L)$. In the second case, randomly select $T$ bits from the second part and fixedly select $K_2 - T$ bits from the first part. Through this selection method, get $Q$ selection patterns to construct the set $\varphi = \{\psi_1, \psi_2, \cdots, \psi_Q\}$.
(c) Get $\varepsilon = \{1, 2, \cdots, Q\}$ corresponding to $\varphi$.
(2) Consider the possible codeword weights $wt(|v|v_{j,p}|) \le \bar{N}$ ($j \in \varepsilon$) yielded by $K_b$ ($K_b < 2^{K_1}$) source information sequences $m$. The $K_b$ information sequences $m$ are obtained by the following method:
(a) Split the length-$K_1$ source information sequence $m$ into two parts, as shown in substep (a) of step 1.
(b) Select $wt(m)$ positions from the two parts for the non-zero bits, and set the remaining $K_1 - wt(m)$ positions to zero, where $0 < wt(m) \le N_1 - K_1 + 1$. Here, $N_1 - K_1 + 1$ is the Singleton bound, which is related to the minimal codeword weight $d_{min}$ at the S, i.e., $d_{min} \le N_1 - K_1 + 1$. The positions are selected in a way similar to substep (b) of step 1. Through this selection method, we get $K_b$ sequences $m$ with weight $0 < wt(m) \le N_1 - K_1 + 1$ (i.e., $0 < wt(m) \le d_{min}$ and $d_{min} < wt(m) \le N_1 - K_1 + 1$). Since these sequences $m$ easily generate low-weight codewords at the S, and hence low-weight codewords at the destination, it is extremely important to consider them.
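Step 1 of Algorithm 2 can be sketched as follows. This is a hedged illustration rather than the authors' procedure: the function name `partial_patterns` and the seeded random generator are hypothetical, the two cases are drawn with equal probability, "fixedly select" is interpreted as taking the leading positions of the smaller part, and the lower limit on $T$ is raised when the smaller part cannot supply $K_2 - T$ bits.

```python
import math
import random


def partial_patterns(k1, k2, q, seed=0):
    """Generate up to q selection patterns (0-based bit positions)."""
    rng = random.Random(seed)
    big = math.ceil((k1 + 1) / 2)           # L in the text
    t_min = math.ceil((k2 + 1) / 2)         # lower limit on T
    patterns = set()
    for _ in range(100 * q):                # bounded number of attempts
        if len(patterns) == q:
            break
        if rng.random() < 0.5:              # case 1: first part has L bits
            rand_part = list(range(big))
            fixed_part = list(range(big, k1))
        else:                               # case 2: second part has L bits
            fixed_part = list(range(k1 - big))
            rand_part = list(range(k1 - big, k1))
        # T bits come from the larger part, K2 - T from the smaller one.
        t = rng.randint(max(t_min, k2 - len(fixed_part)), min(k2, big))
        chosen = rng.sample(rand_part, t) + fixed_part[:k2 - t]
        patterns.add(tuple(sorted(chosen)))
    return sorted(patterns)
```

Each returned tuple plays the role of one selection pattern $\psi_j$; the resulting list can then be ranked by codeword weights in the same way as in Algorithm 1, but over far fewer candidates.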

Complexity Analysis of the Two Optimized Selection Algorithms
This section calculates the encoding complexity of the two optimized algorithms in terms of additions and multiplications.
In Algorithm 1, the complexity of the 5G $(N_1, K_1)$ QC-LDPC$_1$ encoder is first evaluated for encoding one information sequence and then accumulated over the $2^{K_1}$ information sequences to give the overall encoding complexity at S. In R, for one selection pattern, the 5G $(N_2, K_2)$ QC-LDPC$_2$ encoder likewise has to complete the encoding of $2^{K_1}$ information sequences. During determining $\psi^{(1)}$, let $B_w^{opt}$ denote the number of selection patterns considered for finding $M_w^{(j)}$; the overall encoding complexity at R is accumulated over these patterns. The total complexity of Algorithm 1 is then given by Equation (14). In Algorithm 2, the total computational complexity is given by Equation (15), where $B_w^{low}$ is the number of selection patterns considered for finding $M_w^{(j)}$ during determining $\psi^{(2)}$. From Equations (14) and (15), the complexity of the two algorithms is obtained.

Steps of Joint Iterative Decoding Algorithm
Joint iterative decoding is another appealing feature of the proposed D-QC-LDPCC-SM scheme. If the first $K_2$ bits are selected from the $K_1$ information bits, the equivalent parity-check matrix used for decoding at D is expressed as:
$$ H_0 = \begin{bmatrix} H_1 & 0 \\ [\,Q \;\; 0\,] & F \end{bmatrix}, \tag{16} $$
where $H_1$ is the parity-check matrix of the 5G $(N_1, K_1)$ QC-LDPC$_1$ code, $H_2 = [Q, F]$ is the parity-check matrix of the 5G $(N_2, K_2)$ QC-LDPC$_2$ code at R, with $Q$ and $F$ having the sizes $m_z^{(2)} Z^{(2)} \times K_2$ and $m_z^{(2)} Z^{(2)} \times (n_z^{(2)} Z^{(2)} - K_2)$, respectively, and $0$ denotes zero blocks of the appropriate sizes. If the selected $K_2$ bits are not in the first $K_2$ positions, then matrix $Q$ is not necessarily in the first $K_2$ columns. For the convenience of discussion, we directly use the parity-check matrix $H_0$ in Equation (16) to analyze the decoding process, and the three-layer Tanner graph corresponding to $H_0$ is shown in Figure 5. In the three-layer Tanner graph, the check node set $\{c^{(1)}_{m_1}, m_1 = 1, 2, \cdots, m_z^{(1)} Z^{(1)}\}$ related to $H_1$ forms the first layer. The variable node set $\{v_{n_1}, n_1 = 1, 2, \cdots, n_z^{(1)} Z^{(1)}\}$ related to $H_1$ and the variable node set $\{v_{n_2}, n_2 = 1, 2, \cdots, K_2, n_z^{(1)} Z^{(1)} + 1, \cdots, \bar{\bar{N}}\}$ related to $H_2$ constitute the second layer, where $\{v_n, n = 1, 2, \cdots, K_2\}$ is the common variable node set. The check node set $\{c^{(2)}_{m_2}, m_2 = 1, 2, \cdots, m_z^{(2)} Z^{(2)}\}$ related to $H_2$ constitutes the third layer. All check nodes connected with $v_n$ ($n = 1, 2, \cdots, \bar{\bar{N}}$) constitute the set $C(v_n)$, and all variable nodes associated with a check node $c^{(i)}_{m_i}$ ($i = 1, 2$) likewise constitute a set. The decoding steps are as follows.
Step 1: Initialize the LLRs of the punctured and shortened bits for $\bar{\phi}_0$ so that the decoder obtains the LLR sequence of length $\bar{\bar{N}}$, i.e., $\phi_0 = [\phi_{0,1}, \phi_{0,2}, \cdots, \phi_{0,\bar{\bar{N}}}]$. In this sequence, parts 1, 2 and 3 (related to the 5G $(N_1, K_1)$ QC-LDPC$_1$ code) denote the initial LLRs of the $2Z^{(1)}$ punctured information bits, the $k_z^{(1)} Z^{(1)} - K_1$ shortened zero bits and the $(m_z^{(1)} - 2) Z^{(1)} - N_1 + K_1$ punctured parity-check bits, respectively, whereas parts 4 and 5 denote the initial LLRs of the corresponding bits of the 5G $(N_2, K_2)$ QC-LDPC$_2$ code.
Step 2: Compute the extrinsic information $L(r^{(1)}_{m_1,n})$ and $L(r^{(2)}_{m_2,n})$ passed from the check nodes of the first and third layers to the variable node $v_n$.
Step 3: Compute the extrinsic information $L(Q^{(i)}_{m_i,n})$ passed from the variable node $v_n$ to the check node $c^{(i)}_{m_i}$, as in Equations (18) and (19), where $C(v_n) \backslash c^{(i)}_{m_i}$ is the set composed of the elements of $C(v_n)$ other than $c^{(i)}_{m_i}$.
Step 4: Repeat steps 2 and 3. When the maximum number $I_{max}$ of iterations is reached, the LLR $L(c_n)$ of the $n$-th codeword bit $c_n$ and the estimate $\hat{c}_n$ of $c_n$ are computed. Finally, obtain the estimated information sequence $\hat{m} = [\hat{c}_1, \hat{c}_2, \cdots, \hat{c}_{K_1}]$.
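The single-step decoding over the equivalent parity-check matrix can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the authors' decoder: it uses the min-sum (MS) check-node rule, the helper names `build_h0` and `min_sum_decode` are hypothetical, the matrices are small dense 0/1 lists, positive LLRs are assumed to favor bit 0, every check row is assumed to contain at least two 1s, and the initialization of punctured and shortened bits is left to the caller.

```python
def build_h0(h1, q, f):
    """Stack H1 and [Q 0 F] into the equivalent parity-check matrix H0."""
    n1, k2, n_par = len(h1[0]), len(q[0]), len(f[0])
    top = [row + [0] * n_par for row in h1]
    bottom = [qr + [0] * (n1 - k2) + fr for qr, fr in zip(q, f)]
    return top + bottom


def min_sum_decode(h0, llr, max_iter=20):
    """Flooding min-sum decoding over all check nodes of H0 at once."""
    rows = [[n for n, bit in enumerate(r) if bit] for r in h0]
    q_msg = {(m, n): llr[n] for m, row in enumerate(rows) for n in row}
    hard = [0 if x >= 0 else 1 for x in llr]
    for _ in range(max_iter):
        r_msg = {}
        for m, row in enumerate(rows):              # step 2: check nodes
            for n in row:
                others = [q_msg[(m, k)] for k in row if k != n]
                sign = -1 if sum(x < 0 for x in others) % 2 else 1
                r_msg[(m, n)] = sign * min(abs(x) for x in others)
        total = list(llr)                           # a posteriori LLRs
        for (m, n), val in r_msg.items():
            total[n] += val
        for key in q_msg:                           # step 3: variable nodes
            q_msg[key] = total[key[1]] - r_msg[key]
        hard = [0 if x >= 0 else 1 for x in total]  # step 4: decisions
    return hard
```

Because both layers of check nodes are updated in the same iteration, the common variable nodes (the first $K_2$ columns) receive extrinsic information from both codes at once, which is the point of decoding on $H_0$ rather than on $H_1$ and $H_2$ separately.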

Computational Complexity of Joint Iterative Decoding Algorithm
This subsection considers the computational complexity of the proposed joint iterative decoding algorithm in terms of additions and multiplications. In the parity-check matrix $H_0$, let the number of 1s in the $m$-th row be $d_m$, and the number of 1s in the $n$-th column be $\rho_n$, where $m \in \{1, 2, \cdots, m_z^{(1)} Z^{(1)} + m_z^{(2)} Z^{(2)}\}$ and $n \in \{1, 2, \cdots, \bar{\bar{N}}\}$.
First, we consider the complexity of the joint decoding based on the BP algorithm. In step 2, finding the extrinsic information $L(r^{(1)}_{m_1,n})$ and $L(r^{(2)}_{m_2,n})$ separately needs $2d_k$ ($k \in \Lambda_n^1$) and $2d_{m_z^{(1)} Z^{(1)} + l}$ ($l \in \Lambda_n^2$) elementary operations [22], where $\Lambda_n^1$ and $\Lambda_n^2$ represent the index sets of the check nodes in the first and third layers connected to the variable node $v_n$, respectively. Note that $|\Lambda_n^1| + |\Lambda_n^2| = \rho_n$. In step 3, the number of elementary operations required to find the extrinsic information $L(Q^{(i)}_{m_i,n})$ is $\rho_n - 1$. Combining steps 2 and 3 gives the complexity required to complete one iteration, Equation (22). Since the joint decoding algorithm terminates at the predetermined maximum iteration number $I_{max}$, the overall computational complexity of the joint BP decoding algorithm, Equation (23), follows from the per-iteration complexity and $I_{max}$.
Now, the computational complexity of the joint decoding by the MS algorithm is considered. The numbers of elementary operations required to obtain $L(r^{(1)}_{m_1,n})$ and $L(r^{(2)}_{m_2,n})$ again follow [22]. Because the MS algorithm differs from the BP algorithm only in step 2, the corresponding complexity required for one iteration, Equation (24), can be written directly. Since the algorithm terminates in the $I_{max}$-th iteration, the overall computational complexity of the joint MS decoding algorithm is given by Equation (25).

Simulation Results
The BER performance of the proposed and reference systems over a slow Rayleigh fading channel is discussed. In coded cooperative communications, the signal-to-noise ratios (SNRs) of the S-D, S-R and R-D links are represented by $\lambda_{S,D}$, $\lambda_{S,R}$ and $\lambda_{R,D}$, respectively. If $\lambda_{S,R} = \infty$, the S-R link is ideal; otherwise, it is non-ideal. Since R is closer to D than S is, we let R have a 1 dB SNR gain, i.e., $\lambda_{R,D} = \lambda_{S,D} + 1$ dB. The parameters $N_T = 8$ and 16-QAM are used in each simulation. All simulations are reported as BER versus $\lambda_{S,D}$. Table 2 lists the simulation parameters.

Comparisons under Different Information Selection Algorithms
To illustrate the advantages of the proposed optimal and low-complexity information selection algorithms (i.e., Algorithms 1 and 2) over the random selection method, Figure 6 depicts the BER performance of the proposed scheme (λ_{S,R} = ∞) under the different selection approaches. The corresponding selection patterns are shown in Table 3. The joint BP iterative decoding algorithm is utilized to recover the message. The simulated results show that the scheme using either of the two optimized algorithms performs better than the system using the random selection method, which reflects the superiority of the proposed optimized algorithms. For example, at BER = 3 × 10^−5, Algorithms 1 and 2 outperform the random method by 1 and 1.2 dB, respectively. The random selection approach performs worse because it generates a code with a smaller minimum distance of 3 at D, whereas Algorithms 1 and 2 both generate a code with a minimum distance of 6. Moreover, the two optimized algorithms achieve similar performance because the codes they generate have nearly the same number of minimum-weight (weight 6) codewords at D. Further, Equations (14) and (15) show that the complexity of Algorithm 2 is greatly reduced relative to Algorithm 1, which supports the design rationale of the low-complexity Algorithm 2.
Thus, for long block length codes, we concentrate only on the impact of Algorithm 2 on the system performance, as shown in Figures 7 and 8. Table 3 lists the optimized and random patterns. In the simulations, λ_{S,R} = ∞ and the joint BP iterative decoding algorithm are assumed. Because the random selection method generates a smaller average minimum distance, it again performs worse than Algorithm 2. For example, in Figure 7, the proposed scheme under the random method is about 1.2 dB worse than that under Algorithm 2 at BER ≈ 10^−5. In Figure 8, the proposed scheme using the random approach has a 1 dB loss compared to the proposed scheme using Algorithm 2 at BER = 10^−5.
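The role of the minimum distance in this comparison can be illustrated with a toy computation. The sketch below is not Algorithm 1 or Algorithm 2 of this paper. It only brute-forces the minimum Hamming distance, and the number of minimum-weight codewords, of a small binary linear code given its generator matrix. The function name `min_distance_profile`, the two toy generator matrices (a (7,4) Hamming code and its extended (8,4) version) and the ranking rule are assumptions made for illustration; they are not the QC-LDPC codes or selection patterns of Table 3.

```python
from itertools import product

import numpy as np


def min_distance_profile(G):
    """Return (d_min, multiplicity) of the binary linear code generated by G.

    Brute force over all 2^k - 1 nonzero messages, so only feasible for small k.
    """
    G = np.asarray(G, dtype=np.uint8) % 2
    k = G.shape[0]
    d_min, count = None, 0
    for msg in product((0, 1), repeat=k):
        if not any(msg):
            continue  # skip the all-zero codeword
        codeword = (np.array(msg, dtype=np.uint8) @ G) % 2
        weight = int(codeword.sum())
        if d_min is None or weight < d_min:
            d_min, count = weight, 1
        elif weight == d_min:
            count += 1
    return d_min, count


# Hypothetical toy candidates (NOT the codes used in the paper).
G_HAMMING_7_4 = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]
# Same code with an overall parity bit appended to every row.
G_EXT_HAMMING_8_4 = [
    [1, 0, 0, 0, 1, 1, 0, 1],
    [0, 1, 0, 0, 1, 0, 1, 1],
    [0, 0, 1, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1, 1, 0],
]

profiles = {
    "toy code A": min_distance_profile(G_HAMMING_7_4),
    "toy code B": min_distance_profile(G_EXT_HAMMING_8_4),
}
# Assumed ranking: larger d_min first, then fewer minimum-weight codewords.
best = max(profiles, key=lambda name: (profiles[name][0], -profiles[name][1]))
print(profiles, "-> preferred:", best)
```

Under this assumed ranking, the candidate with the larger minimum distance is preferred, and the multiplicity of minimum-weight codewords only breaks ties. The exhaustive enumeration grows as 2^k, which also gives a feel for why an exhaustive search becomes impractical at long block lengths.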

Performance of the Proposed Scheme and Non-Cooperative System
To observe the influence of the practical non-ideal S-R channel condition (λ_{S,R} ≠ ∞) on the system performance, we compare the proposed system over the ideal and non-ideal S-R channels, as shown in Figures 9 and 10. The joint BP decoding algorithm is utilized. The performance under the non-ideal S-R channel is very close to that under the ideal S-R channel. For example, the non-ideal cases in Figures 9 and 10 lose only about 0.1 and 0.15 dB, respectively, relative to the corresponding ideal cases. The results reflect the effectiveness of the proposed system over realistic wireless links. Therefore, studying the proposed D-QC-LDPCC-SM scheme has important theoretical and practical significance.
Table 3. Selection patterns generated in different approaches.

Comparisons between the Proposed Scheme and the Existing System
To further illustrate the effectiveness of the proposed coded cooperative scheme, we compare it with the existing polar-coded cooperative SM (PCC-SM) system [21]. The same conditions, such as λ_{S,R} = ∞, N_T = 8 and 16-QAM, are adopted. Figure 12 shows that the proposed system is superior to the existing PCC-SM system under the same N_R. For example, with N_R = 6, the proposed scheme obtains about 3.1 dB SNR gain over the PCC-SM system at BER ≈ 2 × 10^−4. This gain can be attributed to two reasons: (1) The D-QC-LDPCC-SM scheme uses the joint BP decoding algorithm, while the existing PCC-SM system uses successive cancellation decoding without iterations. (2) The proposed scheme utilizes an optimized information selection algorithm, whereas the existing system selects the subchannel capacities on the basis of a heuristic rather than an optimized method.
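To put the dB figures in perspective, the short sketch below converts the SNR gains quoted in this section into linear power ratios. It uses only the standard relation ratio = 10^(gain_dB/10); the helper name `db_to_linear` and the dictionary labels are illustrative assumptions.

```python
def db_to_linear(gain_db):
    """Convert a power gain in dB to a linear power ratio."""
    return 10.0 ** (gain_db / 10.0)


# SNR gains quoted in this section, in dB.
reported_gains_db = {
    "Algorithm 1 vs. random selection": 1.0,
    "Algorithm 2 vs. random selection": 1.2,
    "D-QC-LDPCC-SM vs. PCC-SM (N_R = 6)": 3.1,
}

for label, gain_db in reported_gains_db.items():
    ratio = db_to_linear(gain_db)
    print(f"{label}: {gain_db} dB is a factor of {ratio:.2f} in SNR")
```

For instance, a 3.1 dB gain means the reference system needs roughly twice the SNR (in linear terms) to reach the same BER.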

Performance of the Proposed Scheme under Various Receive Antenna Numbers and Different Decoding Algorithms
Figures 13 and 14 describe the error performance of the proposed system (λ_{S,R} = ∞) using the joint BP decoding algorithm with different numbers of receive antennas N_R. The simulated results show that increasing N_R greatly improves the BER performance. For example, in Figure 13, the BER is 1.3 × 10^−2 with N_R = 3 at SNR = 11 dB. For N_R = 4, 5 and 6, the BER at the same SNR is 7.7 × 10^−4, 1.0 × 10^−4 and 9.8 × 10^−6, respectively. In Figure 14, the BER of the proposed system under N_R = 3, 4, 5 and 6 is 5.7 × 10^−3, 4.2 × 10^−4, 3.3 × 10^−5 and 4.3 × 10^−6, respectively, at SNR = 11 dB. This shows that configuring more receive antennas provides more diversity gain for the whole cooperative system, thus enhancing the error performance.
In addition, Figures 15 and 16 compare the system performance (λ_{S,R} = ∞) under the joint BP and MS decoding algorithms, where the MS decoding algorithm is a simplified BP-based decoding algorithm. The joint MS decoding algorithm performs worse than the joint BP decoding algorithm. For example, in Figure 15, for N_R = 6, the MS decoding algorithm lags behind the BP decoding algorithm by about 0.5 dB at BER = 2.5 × 10^−5. In Figure 16, at BER = 10^−5, the MS decoding algorithm lags behind the BP decoding algorithm by about 0.9 dB for N_R = 5. This performance loss is mainly the cost of the significant reduction in complexity offered by the joint MS decoding algorithm.
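The trade-off between the two decoders can be seen at a single check node. The sketch below assumes that MS denotes the standard min-sum approximation and compares it with the sum-product (BP) tanh rule for one outgoing check-to-variable message. It does not reproduce the joint decoder on the three-layer Tanner graph; the function names `check_update_bp` and `check_update_ms` and the example LLR values are hypothetical.

```python
import numpy as np


def check_update_bp(incoming_llrs):
    """Sum-product (BP) check-node message from the other edges' LLRs (tanh rule)."""
    llrs = np.asarray(incoming_llrs, dtype=float)
    t = np.prod(np.tanh(llrs / 2.0))
    # Clip to keep arctanh finite when the inputs are very reliable.
    t = np.clip(t, -1.0 + 1e-12, 1.0 - 1e-12)
    return float(2.0 * np.arctanh(t))


def check_update_ms(incoming_llrs):
    """Min-sum approximation: product of signs times the smallest magnitude."""
    llrs = np.asarray(incoming_llrs, dtype=float)
    return float(np.prod(np.sign(llrs)) * np.min(np.abs(llrs)))


# Hypothetical incoming LLRs on the other edges of one check node.
example_llrs = [1.5, -0.8, 2.3]
bp_msg = check_update_bp(example_llrs)
ms_msg = check_update_ms(example_llrs)
print(f"BP message: {bp_msg:.4f}, MS message: {ms_msg:.4f}")
```

Min-sum replaces the tanh and arctanh evaluations with sign and minimum operations, which is where the complexity saving comes from. It keeps the sign of the BP message but overestimates its magnitude, which is one commonly cited reason for the loss relative to BP.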

Conclusions
A novel D-QC-LDPCC-SM scheme is proposed. By adopting optimized information bit selection at the relay, an optimized code is generated at the destination. Compared to the random selection method, the exhaustive-based and partial-based search information bit selection algorithms obtain gains of 1 and 1.2 dB, respectively, owing to the larger minimum distance obtained at the destination. In the proposed coded cooperative scheme, the relay-to-destination link has a larger SNR gain than the source-to-destination link, which gives the proposed system a 0.9 dB gain over its non-cooperative counterpart. Additionally, by using proper selection and the joint iterative decoding algorithm, the proposed scheme outperforms the existing PCC-SM scheme by 3.1 dB.
Author Contributions: C.Z. conceived the idea. She developed the mathematical models and performed the Monte Carlo simulations. F.Y. checked the mathematical model and the simulated results. D.K.W., C.C. and H.X. revised the manuscript. All authors have read and agreed to the published version of the manuscript.