Precoding for RIS-Assisted Multi-User MIMO-DQSM Transmission Systems

Abstract: This paper presents two precoding techniques for a reconfigurable intelligent surface (RIS)-assisted multi-user (MU) multiple-input multiple-output (MIMO) double quadrature spatial modulation (DQSM) downlink transmission system. Instead of being applied at the remote RIS, the phase shift vector is applied at the base station (BS) by using a double precoding stage. Results show that the proposed RIS-MU-MIMO-DQSM system has gains of up to 17 dB in terms of bit error rate (BER) and a reduction in detection complexity of 51% when compared with the conventional MU-MIMO system based on quadrature amplitude modulation (QAM). Compared with a similar system based on an amplify-and-forward (AF) relay-assisted technique, the proposed system has a gain of up to 18 dB in terms of BER under the same conditions and parameters.


Introduction
Massive multiple-input multiple-output (MIMO), in combination with leading-edge technologies, methodologies, and architectures, is poised to be a cornerstone technology for next-generation wireless systems and networks. The capabilities and performance of future massive MIMO systems will be enhanced through the incorporation of reconfigurable intelligent surfaces (RIS), artificial intelligence (AI), Terahertz (THz) communications, and cell-free architectures [1]. In particular, RIS-assisted systems have recently been proposed as one of the key enabling technologies for beyond-5G/6G wireless communication networks to support a massive number of users with high data rates, low latency, and secure transmissions with both spectral and energy efficiency [2-4]. An RIS is a surface of electromagnetic metamaterial that can control the phase, amplitude, frequency, and/or polarisation of the impinging signals in a nearly passive way without the need for radio-frequency (RF) operations [5,6]. For example, the RIS can be designed to coherently combine the signals at the receiver. In this way, the RIS turns the destructive effect of the multipath fading channel into a controllable channel that exploits diversity gains to improve the performance of the system. As a result, the overall energy of the transmitted signals can be used more efficiently. Since the RIS does not amplify the signals, RF chains are not required and thermal noise is not added during reflections [7-9]. RIS-assisted systems can be further improved if intelligent omni-surfaces (IOS) are considered. IOS-based systems can achieve full-dimensional wireless communications by enabling the simultaneous reflection and

System Model
The RIS-MU-DQSM system model is presented in Figure 1. We assume that there is no direct link between the transmitter (Tx) and the receivers (Rx), which can be due to unfavorable propagation conditions or blockage by obstacles [24,25]. All communication passes through the RIS. The BS is equipped with $N_t$ Tx antennas, while the destination is composed of $K$ users or MS, each one equipped with $N_r$ Rx antennas. The RIS uses $N_s$ reflective surfaces. Thus, the end-to-end configuration can be considered as a $(K \times N_r) \times N_s \times N_t$ RIS-assisted MU-MIMO-DQSM transmission system, hereinafter referred to as RIS-MU-DQSM.
The channel matrix between the BS and the RIS is defined as $\mathbf{G} \in \mathbb{C}^{N_s \times N_t}$. The channel matrix between the RIS and the $k$-th MS is defined as $\mathbf{H}_k \in \mathbb{C}^{N_r \times N_s}$. The channel model is defined for two scenarios. First, a quasi-static Rayleigh fading channel is considered, whose elements are assumed to be independent and identically distributed (i.i.d.) complex Gaussian random variables with mean zero and variance one, $\mathcal{CN}(0, 1)$. Second, a more realistic channel model is considered that accounts for the spatial correlation between the RIS mirrors. We assume that perfect CSI is available at the BS and at all MS or users. It is worth noting that obtaining CSI can be a challenging task [26]. As proposed in previous works, this task can be carried out using control links between the BS and the users when the line-of-sight (LoS) path is absent in this channel [27,28]. However, the design of CSI dissemination strategies is beyond the scope of this work.

Transmission
In order to generate the DQSM signal, the sequence of input bits $\mathbf{a}_k = \{d_n\}_{n=1}^{m}$, with $d_n \in \{0, 1\}$, is split into two flows. Each data flow is composed of a sequence $\mathbf{b}_i = \{d_n\}_{n=1}^{m/2}$, $i \in \{1, 2\}$, which is fed into a quadrature spatial modulation (QSM) block. Figure 2 shows the DQSM modulation block used for the proposed scheme. The QSM signals are denoted by $\mathbf{x}_i \in \mathbb{C}^{L \times 1}$, $i \in \{1, 2\}$, with components $x_l \in \{0, s_{\Re}, s_{\Im}\}$, where $s_{\Re}$ and $s_{\Im}$ represent the real and the imaginary parts of the $M$-ary quadrature amplitude modulation ($M$-QAM) symbol $s$, respectively. The output vector of the $i$-th QSM block is defined as

$$\mathbf{x}_i = [x_1, x_2, \ldots, x_L]^T, \tag{1}$$

where the index $l$ in $x_l$ denotes the transmitted signal in the $l$-th position.

In order to generate the QSM vector $\mathbf{x}_i$, the input bit sequence is divided into three streams. One stream is used to modulate an $M$-QAM signal and the other two streams (spatial bits) are used to modulate the position in the QSM output vector. For an input bit sequence of length $m_{QSM} = m_{DQSM}/2$, the first $\log_2(M)$ bits modulate an $M$-QAM symbol, and the remaining $2\log_2(L) = m_{QSM} - \log_2(M)$ bits are divided into two streams with $\log_2(L)$ spatial bits each. These spatial bits modulate the position of the $M$-QAM symbol in the output vector $\mathbf{x}_i \in \mathbb{C}^{L \times 1}$ using an SM block as follows: the real part of the $M$-QAM symbol is assigned to a specific position in the output vector, while the remaining $L - 1$ positions are set to zero. The imaginary part of the $M$-QAM symbol is assigned to another, or even the same, position in the output vector. Finally, these two signals are combined to obtain the QSM output vector $\mathbf{x}_i$ [29]. The number of bits that can be transmitted using QSM is $m_{QSM} = \log_2(M) + 2\log_2(L)$.

The two QSM signals $\mathbf{x}_1$ and $\mathbf{x}_2$ are then weighted by the factors $B_1$ and $B_2$, which guarantee the maximal Euclidean distance between symbols. These two signals are combined to generate the DQSM output vector $\tilde{\mathbf{x}}_k$ intended for the $k$-th user as [19]

$$\tilde{\mathbf{x}}_k = B_1\mathbf{x}_1 + B_2\mathbf{x}_2. \tag{2}$$

The output vector of the $k$-th DQSM block, $\tilde{\mathbf{x}}_k \in \mathcal{A}$, can be written as

$$\tilde{\mathbf{x}}_k = [\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_L]^T, \tag{3}$$

where the index $l$ in $\tilde{x}_l$ denotes the transmitted signal in the $l$-th position. The complete transmission block for the $K$ users in the system before precoding is composed as

$$[\tilde{\mathbf{x}}_1^T, \ldots, \tilde{\mathbf{x}}_j^T, \tilde{\mathbf{x}}_{j+1}^T, \ldots, \tilde{\mathbf{x}}_K^T]^T, \tag{4}$$

where the first $j$ users are the users under multiuser interference (MUI) and the other $K - j$ users are free from MUI. The number of bits that can be transmitted to each user using DQSM is

$$m_{DQSM} = 2\left(\log_2(M) + 2\log_2(L)\right). \tag{5}$$
DQSM is mainly used because it has good BER performance and low detection complexity [19].
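The QSM and DQSM mapping described above can be illustrated with a minimal Python sketch. The bit ordering (symbol bits first, then the real-part position, then the imaginary-part position), the QPSK labelling, and the weights `B1 = 1.0` and `B2 = 0.5` are assumptions made only for illustration; the actual factors $B_1$ and $B_2$ are those of [19] and are not given in this section.

```python
import itertools
import math
import numpy as np

def bits_to_int(bits):
    """Interpret a list of 0/1 values as an unsigned integer (empty list -> 0)."""
    return int("".join(str(b) for b in bits), 2) if bits else 0

def qsm_vector(bits, L, constellation):
    """Map log2(M) + 2*log2(L) bits to one QSM vector of length L."""
    n_sym = int(math.log2(len(constellation)))
    n_pos = int(math.log2(L))
    assert len(bits) == n_sym + 2 * n_pos
    s = constellation[bits_to_int(bits[:n_sym])]
    pos_re = bits_to_int(bits[n_sym:n_sym + n_pos])   # position of the real part
    pos_im = bits_to_int(bits[n_sym + n_pos:])        # position of the imaginary part
    x = np.zeros(L, dtype=complex)
    x[pos_re] += s.real
    x[pos_im] += 1j * s.imag
    return x

def dqsm_vector(bits, L, constellation, B1=1.0, B2=0.5):
    """Combine two QSM vectors as B1*x1 + B2*x2 (B1, B2 are hypothetical values)."""
    half = len(bits) // 2
    x1 = qsm_vector(bits[:half], L, constellation)
    x2 = qsm_vector(bits[half:], L, constellation)
    return B1 * x1 + B2 * x2

# Assumed QPSK labelling, unit average energy.
QPSK = np.array([1 + 1j, -1 + 1j, 1 - 1j, -1 - 1j]) / np.sqrt(2)
L = 2
m_dqsm = 2 * (int(math.log2(len(QPSK))) + 2 * int(math.log2(L)))  # bits per user

x = dqsm_vector([0, 1, 0, 1, 1, 0, 1, 1], L, QPSK)

# Enumerate the whole DQSM alphabet for this configuration.
alphabet = {
    tuple(np.round(dqsm_vector(list(b), L, QPSK), 6))
    for b in itertools.product([0, 1], repeat=m_dqsm)
}
```

With $M = 4$ and $L = 2$ this gives 8 bits per user and an alphabet of 256 distinct vectors, which matches the size of the DQSM signal set mentioned for Table 1.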
As shown in Figure 1, the precoder is composed of two stages. Users under MUI are first taken to Precoder 1, which is intended to pre-cancel the MUI in the second hop (from the RIS to the MS). The output of Precoder 1 is the vector $\mathbf{x}_1$, defined as

$$\mathbf{x}_1 = \sum_{k=1}^{j} \mathbf{W}_k\tilde{\mathbf{x}}_k. \tag{6}$$

In Precoder 1, the vectors $\tilde{\mathbf{x}}_k$ are precoded by the matrices $\mathbf{W}_k \in \mathbb{C}^{N_s' \times N_r'}$, where $N_s' \le N_s$ is the number of reflecting mirrors intended for users under MUI and $N_r' \le N_r$ is the number of receive antennas in this subset. Users who are not under MUI are concatenated in the vector $\mathbf{x}_2$ and then taken to Precoder 2. The input vector to Precoder 2 is

$$\mathbf{x} = [\mathbf{x}_1^T, \mathbf{x}_2^T]^T. \tag{7}$$

Precoder 2 is designed to pre-cancel the interference in the first hop (from the BS to the RIS). It guarantees that signals with the appropriate phases and amplitudes are reflected by the RIS. The output signal $\mathbf{t}_x \in \mathbb{C}^{N_t \times 1}$ of Precoder 2 can be written as

$$\mathbf{t}_x = \mathbf{F}\mathbf{x}, \tag{8}$$

where $\mathbf{F} \in \mathbb{C}^{N_t \times N_s}$ is a precoding matrix. Table 1 shows the first 16 out of 256 DQSM signals.

Reception
The impinging signals at the RIS are defined as

$$\sqrt{1/\alpha}\,\mathbf{G}\mathbf{t}_x, \tag{9}$$

where $\mathbf{t}_x \in \mathbb{C}^{N_t \times 1}$ is the signal vector transmitted by the BS and $\sqrt{1/\alpha}$ is an attenuation factor. These signals are reflected by the RIS without adding noise. Then, the signal received by the $k$-th user at the destination is given by

$$\mathbf{y}_k = \sqrt{\gamma_k}\,\mathbf{H}_k\boldsymbol{\Theta}\mathbf{G}\mathbf{t}_x + \mathbf{n}_k, \tag{10}$$

where $\boldsymbol{\Theta}$ is the phase shift matrix, defined as $\boldsymbol{\Theta} = \mathrm{diag}(e^{j\theta_1}, e^{j\theta_2}, \ldots, e^{j\theta_{N_s}})$. Here, $e^{j\theta_i}$ denotes the phase shift of the $i$-th reflecting element at the RIS and $\mathbf{n}_k \in \mathbb{C}^{N_r \times 1}$ stands for the noise. The noise samples are assumed to be independent and identically distributed (i.i.d.) with $\mathcal{CN}(0, \sigma^2)$, and $\gamma_k$ is the signal-to-noise ratio (SNR) at the destination. Instead of applying the phase shift vector at the RIS, in this work the signals with the optimal phases and amplitudes required at the RIS are evaluated and implemented at the BS. Since only precoding is used, the phases at the RIS can be considered arbitrary or fixed. In this paper, the RIS is considered "blind", with $\boldsymbol{\Theta} = \mathbf{I}$.

Channel Model
For the channel model, we consider a flat fading Rayleigh channel where both the BS and the RIS are affected by spatial correlation. Note that mirrors can be optimally selected for a particular user in order to minimize the spatial correlation in the RIS. However, this approach is beyond the scope of this work. The proper operation of the system requires that each user be illuminated by a subgroup of at least $N_s = N_r$ mirrors. If $N_s > N_r$, the receiver can take advantage of diversity gains. In order to evaluate the effects of the spatially correlated fading channel, the following Kronecker model is considered [30]:

$$\mathbf{H} = \mathbf{R}_r^{1/2}\,\mathbf{H}_w\,\mathbf{R}_t^{1/2}, \tag{11}$$

where the elements of the matrix $\mathbf{H}_w$ are assumed to be i.i.d. complex Gaussian random variables with mean zero and variance one, $\mathcal{CN}(0, 1)$. The matrices $\mathbf{R}_r$ and $\mathbf{R}_t$ are the receive and transmit correlation matrices, respectively. The correlation matrices are defined using the exponential model as

$$[\mathbf{R}_t]_{i,j} = \rho_t^{|i-j|}, \qquad [\mathbf{R}_r]_{i,j} = \rho_r^{|i-j|},$$

where $\rho_t$ and $\rho_r$ are the correlation coefficients between adjacent antennas at the transmitter and the receiver sides, respectively. In this work, we consider a correlated channel on the transmission and the reception sides of the first hop for all evaluated systems.
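As an illustration of the Kronecker model with exponential correlation, the following Python sketch draws one correlated channel realization. It assumes real-valued correlation coefficients and computes the matrix square roots by eigendecomposition; the function names and the $8 \times 8$ size are hypothetical choices for the example, and the value 0.7 is used only as a sample coefficient.

```python
import numpy as np

def exp_corr(n, rho):
    """Exponential correlation matrix with entries rho**|i - j| (real rho assumed)."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def sqrtm_psd(R):
    """Hermitian square root of a positive semidefinite matrix."""
    w, V = np.linalg.eigh(R)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.conj().T

def kronecker_channel(n_rx, n_tx, rho_r, rho_t, rng):
    """One realization of H = R_r^(1/2) H_w R_t^(1/2) with CN(0, 1) entries in H_w."""
    H_w = (rng.standard_normal((n_rx, n_tx))
           + 1j * rng.standard_normal((n_rx, n_tx))) / np.sqrt(2)
    return sqrtm_psd(exp_corr(n_rx, rho_r)) @ H_w @ sqrtm_psd(exp_corr(n_tx, rho_t))

rng = np.random.default_rng(0)
R = exp_corr(4, 0.7)
G = kronecker_channel(8, 8, rho_r=0.7, rho_t=0.7, rng=rng)  # e.g. a correlated first hop
```

Setting both coefficients to zero reduces the correlation matrices to the identity, so the same function also covers the uncorrelated Rayleigh case.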

Interference Cancellation
As shown in Figure 3, MUI can be present in the system whenever two or more users are under the same beams. However, other users who are far away are free from interference. Since the RIS does not have the capacity to cancel the interference produced in the first hop, a zero-forcing (ZF) technique is adequate for this precoding stage [31,32]. Therefore, the complete transmission block is precoded by using ZF. Additionally, for those users under MUI, a block diagonalization (BD) precoding technique is first used [33]. The BD technique is implemented to design the phase and the amplitude of the signals reflected by the RIS. Note that the ZF technique cancels the channel. However, CSI in the second hop is used to transmit the spatially modulated signals, ensuring that the complete information can be retrieved by the users at the destination. For the implementation of the system, two strategies are used. Strategy I is used when some users are under MUI, as shown in Figure 3. Strategy II can be used when all users are under MUI or all users can see the entire RIS.

Strategy I. ZF-BD Precoding
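In this strategy, a BD technique is first used in Precoder 1 at the BS to generate the signals with the appropriate phases required by the RIS in order to cancel the interference in the second hop. Then, a ZF technique is used in Precoder 2 to pre-cancel the interference in the first hop. Figure 4 shows a block diagram of Strategy I. Strategy I can be used when some users are MUI-free.

Let us consider ZF precoding in Precoder 2. The precoding matrix is defined as $\mathbf{F} = \mathbf{G}^{+}$. Then, the output can be written as [32]

$$\mathbf{t}_x = \mathbf{G}^{+}\mathbf{x}, \tag{14}$$

where $\mathbf{G}^{+}$ represents the (pseudo-)inverse of the channel matrix $\mathbf{G}$. Substituting (14) in (10) and considering the blind RIS ($\boldsymbol{\Theta} = \mathbf{I}$), we obtain

$$\mathbf{y}_k = \sqrt{\gamma_k}\,\mathbf{H}_k\mathbf{G}\mathbf{G}^{+}\mathbf{x} + \mathbf{n}_k = \sqrt{\gamma_k}\,\mathbf{H}_k\mathbf{x} + \mathbf{n}_k. \tag{15}$$

Equation (15) represents the signals received by all users. For users that are free from MUI, the received signal is

$$\mathbf{y}_k = \sqrt{\gamma_k}\,\mathbf{H}_k\tilde{\mathbf{x}}_k + \mathbf{n}_k. \tag{16}$$

Since $\mathbf{y}_k$ is free from MUI, only Precoder 2 is used to pre-cancel the interference. For users affected by MUI, substituting (6) in (15), the signal received by the $k$-th user is

$$\mathbf{y}_k = \sqrt{\gamma_k}\,\mathbf{H}_k\sum_{i=1}^{j}\mathbf{W}_i\tilde{\mathbf{x}}_i + \mathbf{n}_k, \tag{17}$$

where $j$ is the number of users under MUI. Equation (17) can be rewritten as

$$\mathbf{y}_k = \sqrt{\gamma_k}\,\mathbf{H}_k\mathbf{W}_k\tilde{\mathbf{x}}_k + \sqrt{\gamma_k}\,\mathbf{H}_k\sum_{i=1,\, i\neq k}^{j}\mathbf{W}_i\tilde{\mathbf{x}}_i + \mathbf{n}_k. \tag{18}$$

The first term in (18) is the signal intended for the $k$-th user, the second term is the interference produced by the other users in the system, and the third term is noise. For the users under MUI, the mathematical model of the complete system can be represented as

$$\begin{bmatrix}\mathbf{y}_1\\ \vdots\\ \mathbf{y}_j\end{bmatrix} = \begin{bmatrix}\mathbf{H}_1\mathbf{W}_1 & \cdots & \mathbf{H}_1\mathbf{W}_j\\ \vdots & \ddots & \vdots\\ \mathbf{H}_j\mathbf{W}_1 & \cdots & \mathbf{H}_j\mathbf{W}_j\end{bmatrix}\begin{bmatrix}\tilde{\mathbf{x}}_1\\ \vdots\\ \tilde{\mathbf{x}}_j\end{bmatrix} + \begin{bmatrix}\mathbf{n}_1\\ \vdots\\ \mathbf{n}_j\end{bmatrix}, \tag{19}$$

where we set $\sqrt{\gamma_k} = 1$ without loss of generality. In order to eliminate the interference term, we require that $\mathbf{H}_k\mathbf{W}_i = \mathbf{0}$, $\forall i \neq k$, where $\mathbf{0}$ denotes an all-zero matrix. This condition can be written as

$$\bar{\mathbf{H}}_k\mathbf{W}_k = \mathbf{0}, \tag{20}$$

where the matrix $\bar{\mathbf{H}}_k$ contains the channel matrices of all users in the system except that of the $k$-th user, i.e.,

$$\bar{\mathbf{H}}_k = [\mathbf{H}_1^T, \ldots, \mathbf{H}_{k-1}^T, \mathbf{H}_{k+1}^T, \ldots, \mathbf{H}_j^T]^T. \tag{21}$$

The matrix $\mathbf{W}_k$ can be obtained by decomposing $\bar{\mathbf{H}}_k$ into its singular values as

$$\bar{\mathbf{H}}_k = \mathbf{U}_k\,[\boldsymbol{\Sigma}_k\ \ \mathbf{0}]\,[\mathbf{V}_k^{(1)}\ \ \mathbf{V}_k^{(0)}]^H, \tag{22}$$

where $\mathbf{U}_k$ is a unitary matrix, $\boldsymbol{\Sigma}_k$ is a diagonal matrix containing the non-negative singular values of $\bar{\mathbf{H}}_k$ with dimension equal to the rank of $\bar{\mathbf{H}}_k$, $\mathbf{0}$ is an all-zero matrix, and $\mathbf{V}_k^{(0)}$, whose columns span the null space of $\bar{\mathbf{H}}_k$, can be used as the precoding matrix $\mathbf{W}_k$ in Precoder 1. Note that the BD technique is applied for $N_t = jN_r$, where $j$ is the number of users under MUI.

Considering (20), the received signal is reduced to

$$\mathbf{y}_k = \mathbf{H}_k\mathbf{W}_k\tilde{\mathbf{x}}_k + \mathbf{n}_k, \tag{23}$$

which is an interference-free signal. Finally, the complete system is reduced to

$$\begin{bmatrix}\mathbf{y}_1\\ \vdots\\ \mathbf{y}_j\end{bmatrix} = \begin{bmatrix}\mathbf{H}_1\mathbf{W}_1 & & \mathbf{0}\\ & \ddots & \\ \mathbf{0} & & \mathbf{H}_j\mathbf{W}_j\end{bmatrix}\begin{bmatrix}\tilde{\mathbf{x}}_1\\ \vdots\\ \tilde{\mathbf{x}}_j\end{bmatrix} + \begin{bmatrix}\mathbf{n}_1\\ \vdots\\ \mathbf{n}_j\end{bmatrix}, \tag{24}$$

which shows the cancellation of the undesired components at the destination.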
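The two precoding stages of Strategy I can be checked numerically with a short Python sketch. It assumes a noiseless link, a blind RIS ($\boldsymbol{\Theta} = \mathbf{I}$), i.i.d. Rayleigh channels, all $j$ users under MUI, and random placeholder vectors in place of real DQSM symbols; the helper names and the sizes $j = 4$, $N_r = 2$, $N_s = N_t = 8$ are illustrative choices, not the authors' simulation code.

```python
import numpy as np

def crandn(rng, *shape):
    """i.i.d. CN(0, 1) samples."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def bd_precoders(H_list, tol=1e-10):
    """For each user k, return a basis of the null space of the other users' stacked channels."""
    W = []
    for k in range(len(H_list)):
        H_bar = np.vstack([H for i, H in enumerate(H_list) if i != k])
        _, s, Vh = np.linalg.svd(H_bar)        # full SVD: Vh is square
        rank = int(np.sum(s > tol))
        W.append(Vh.conj().T[:, rank:])        # right singular vectors beyond the rank
    return W

rng = np.random.default_rng(1)
j, N_r, N_s, N_t = 4, 2, 8, 8
G = crandn(rng, N_s, N_t)                      # BS -> RIS
H = [crandn(rng, N_r, N_s) for _ in range(j)]  # RIS -> user k

W = bd_precoders(H)                            # Precoder 1 (BD, second hop)
x_tilde = [crandn(rng, N_r) for _ in range(j)] # placeholder symbols, not real DQSM vectors
x = sum(Wk @ xk for Wk, xk in zip(W, x_tilde))
t_x = np.linalg.pinv(G) @ x                    # Precoder 2 (ZF, first hop)

y = [Hk @ G @ t_x for Hk in H]                 # noiseless reception with Theta = I
leak = max(np.abs(H[k] @ W[i]).max() for k in range(j) for i in range(j) if i != k)
```

Here `leak` is at the level of floating-point round-off, and each `y[k]` matches `H[k] @ W[k] @ x_tilde[k]`, which is the interference-free form of the received signal.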

Strategy II. Joint-BD Precoding
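In order to improve the flexibility in the design of systems with $N_s \ge N_t$, a joint-precoding strategy is proposed in this subsection. Strategy II combines the channel matrix $\mathbf{G}$ and all channel matrices $\mathbf{H}_k$ in order to generate an equivalent channel matrix for each user. Then, the BD technique can be used to pre-cancel the interference in both hops at the same time. Figure 5 shows a block diagram of Strategy II.

Strategy II can be used when all users are under MUI or when all users see the complete RIS. Let us consider an equivalent matrix defined as $\mathbf{H}^{Eq}_k \triangleq \mathbf{H}_k\boldsymbol{\Theta}\mathbf{G} \in \mathbb{C}^{N_r \times N_t}$. Then, the received signal for the $k$-th user in (10) can be rewritten as

$$\mathbf{y}_k = \sqrt{\gamma_k}\,\mathbf{H}^{Eq}_k\mathbf{t}_x + \mathbf{n}_k. \tag{25}$$

Considering $\mathbf{F} = \mathbf{I}$ in Precoder 2, it follows that $\mathbf{t}_x = \mathbf{x}$. Now, considering a precoding matrix $\mathbf{Z}^{Eq}_k \in \mathbb{C}^{N_t \times N_r}$ for the $k$-th user, the transmission vector in Precoder 1 can be written as

$$\mathbf{x} = \sum_{k=1}^{K}\mathbf{Z}^{Eq}_k\tilde{\mathbf{x}}_k. \tag{26}$$

The signal received by the $k$-th user can be written as

$$\mathbf{y}_k = \sqrt{\gamma_k}\,\mathbf{H}^{Eq}_k\sum_{i=1}^{K}\mathbf{Z}^{Eq}_i\tilde{\mathbf{x}}_i + \mathbf{n}_k, \tag{27}$$

which can also be expressed as

$$\mathbf{y}_k = \sqrt{\gamma_k}\,\mathbf{H}^{Eq}_k\mathbf{Z}^{Eq}_k\tilde{\mathbf{x}}_k + \sqrt{\gamma_k}\,\mathbf{H}^{Eq}_k\sum_{i=1,\, i\neq k}^{K}\mathbf{Z}^{Eq}_i\tilde{\mathbf{x}}_i + \mathbf{n}_k, \tag{28}$$

where the first term in (28) is the signal sent to the $k$-th user, the second term is the interference produced by the other users in the system, and the third term is the noise. In order to remove the interference term, we require that $\mathbf{H}^{Eq}_k\mathbf{Z}^{Eq}_i = \mathbf{0}$, $\forall i \neq k$. This condition can be written as

$$\bar{\mathbf{H}}^{Eq}_k\mathbf{Z}^{Eq}_k = \mathbf{0}, \tag{29}$$

where the matrix $\bar{\mathbf{H}}^{Eq}_k$ contains the equivalent matrices of all users in the system except that of the $k$-th user, i.e.,

$$\bar{\mathbf{H}}^{Eq}_k = [(\mathbf{H}^{Eq}_1)^T, \ldots, (\mathbf{H}^{Eq}_{k-1})^T, (\mathbf{H}^{Eq}_{k+1})^T, \ldots, (\mathbf{H}^{Eq}_K)^T]^T. \tag{30}$$

The matrix $\mathbf{Z}^{Eq}_k$ is obtained by decomposing $\bar{\mathbf{H}}^{Eq}_k$ into its singular values as

$$\bar{\mathbf{H}}^{Eq}_k = \mathbf{U}_k\,[\boldsymbol{\Sigma}_k\ \ \mathbf{0}]\,[\mathbf{V}_k^{(1)}\ \ \mathbf{V}_k^{(0)}]^H. \tag{31}$$

Similar to (22), $\boldsymbol{\Sigma}_k$ is a diagonal matrix containing the non-negative singular values of $\bar{\mathbf{H}}^{Eq}_k$, and the matrix $\mathbf{V}_k^{(0)}$ contains the last $N_r$ columns of $\mathbf{V}_k$, which form an orthogonal basis of the null space of $\bar{\mathbf{H}}^{Eq}_k$. Eliminating the interference term in (28), the received signal can be rewritten as

$$\mathbf{y}_k = \sqrt{\gamma_k}\,\mathbf{H}^{Eq}_k\mathbf{Z}^{Eq}_k\tilde{\mathbf{x}}_k + \mathbf{n}_k, \tag{32}$$

which is an interference-free signal. Finally, the complete system can be represented as

$$\begin{bmatrix}\mathbf{y}_1\\ \vdots\\ \mathbf{y}_K\end{bmatrix} = \begin{bmatrix}\mathbf{H}^{Eq}_1\mathbf{Z}^{Eq}_1 & & \mathbf{0}\\ & \ddots & \\ \mathbf{0} & & \mathbf{H}^{Eq}_K\mathbf{Z}^{Eq}_K\end{bmatrix}\begin{bmatrix}\tilde{\mathbf{x}}_1\\ \vdots\\ \tilde{\mathbf{x}}_K\end{bmatrix} + \begin{bmatrix}\mathbf{n}_1\\ \vdots\\ \mathbf{n}_K\end{bmatrix}, \tag{33}$$

where $\sqrt{\gamma_k} = 1$ without loss of generality. Note that, unlike in Strategy I, the precoding matrix $\mathbf{Z}^{Eq}_k$ can be used for arbitrary values of $N_s$. From here on, the systems based on Strategy I and Strategy II shall be referred to as RIS-MU-DQSM-I and RIS-MU-DQSM-II, respectively.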
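A minimal Python sketch of the joint precoding idea follows, using the $(4 \times 2) \times 32 \times 8$ configuration to show that the number of mirrors may exceed the number of transmit antennas. It assumes $\boldsymbol{\Theta} = \mathbf{I}$ and i.i.d. Rayleigh channels, and the helper names are hypothetical.

```python
import numpy as np

def crandn(rng, *shape):
    """i.i.d. CN(0, 1) samples."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def last_right_singular_vectors(A, n_cols):
    """Last n_cols columns of V in the full SVD A = U S V^H."""
    _, _, Vh = np.linalg.svd(A)
    return Vh.conj().T[:, -n_cols:]

rng = np.random.default_rng(2)
K, N_r, N_s, N_t = 4, 2, 32, 8
G = crandn(rng, N_s, N_t)                          # BS -> RIS
H = [crandn(rng, N_r, N_s) for _ in range(K)]      # RIS -> user k

H_eq = [Hk @ G for Hk in H]                        # equivalent N_r x N_t channels, Theta = I
Z = []
for k in range(K):
    H_bar_eq = np.vstack([H_eq[i] for i in range(K) if i != k])  # (K-1)N_r x N_t
    Z.append(last_right_singular_vectors(H_bar_eq, N_r))

leak = max(np.abs(H_eq[k] @ Z[i]).max() for k in range(K) for i in range(K) if i != k)
ranks = [np.linalg.matrix_rank(H_eq[k] @ Z[k]) for k in range(K)]
```

Because the null space is computed on the $N_r \times N_t$ equivalent channels, the precoder size depends on $N_t$ and $N_r$ only. The sketch relies on $N_t \ge KN_r$ so that each stacked matrix has at least $N_r$ null-space dimensions.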

Detection
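Assuming that the receiver has perfect knowledge of the channel gains, the maximum likelihood (ML) criterion compares the Euclidean distance between the received signal and all possible noiseless received signals in the system. The optimal ML detection criterion for the RIS-MU-DQSM-I system is defined as

$$\hat{\mathbf{x}}_k = \arg\min_{\mathbf{d}_i \in \mathbf{D}} \left\|\mathbf{y}_k - \mathbf{H}_k\mathbf{W}_k\mathbf{d}_i\right\|^2, \tag{34}$$

where the matrix $\mathbf{D} \in \mathbb{C}^{N_r \times 2^m}$ accounts for the complete group of possible noiseless DQSM signals at the receiver and $\mathbf{d}_i$ denotes its $i$-th column. For those users who are free of MUI, $\mathbf{W}_k = \mathbf{I}$ in (34). For the RIS-MU-DQSM-II system, the ML detection criterion is defined as

$$\hat{\mathbf{x}}_k = \arg\min_{\mathbf{d}_i \in \mathbf{D}} \left\|\mathbf{y}_k - \mathbf{H}^{Eq}_k\mathbf{Z}^{Eq}_k\mathbf{d}_i\right\|^2. \tag{35}$$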
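The exhaustive ML search can be sketched in Python as follows. The effective channel `Q` stands for the product of the user's channel and its precoding matrix; the numeric values of `Q`, the 16-column candidate matrix `D`, and the small perturbation are made-up toy values for illustration rather than a real DQSM alphabet.

```python
import itertools
import numpy as np

def ml_detect(y, Q, D):
    """Return the index of the column d of D that minimises ||y - Q d||^2."""
    residual = y[:, None] - Q @ D                  # one column per candidate
    metric = np.sum(np.abs(residual) ** 2, axis=0)
    return int(np.argmin(metric))

# Toy effective channel (hypothetical values, N_r = 2).
Q = np.array([[1.0, 0.3j],
              [-0.2, 0.8 + 0.1j]])

# Toy candidate set: every pair of QPSK-like points, 16 distinct columns.
points = [1 + 1j, -1 + 1j, 1 - 1j, -1 - 1j]
D = np.array(list(itertools.product(points, repeat=2))).T   # shape (2, 16)

true_index = 5
noise = 0.01 * np.array([1 - 1j, -1 + 1j])         # small fixed perturbation
y = Q @ D[:, true_index] + noise
detected = ml_detect(y, Q, D)
```

Computing `Q @ D` once and reusing it for every received vector mirrors the way the detection complexity is counted: one matrix product for the noiseless candidates, followed by subtractions, magnitudes, and a minimum search.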

Detection Complexity
The detection complexity ($\eta$) of the analyzed systems is evaluated by counting the total number of floating-point operations (flops) required for the detection. All systems are compared considering the optimal ML detection criterion. For real additions, multiplications, and comparisons, 1 flop is carried out. For complex additions and multiplications, 2 and 6 flops are carried out, respectively, while subtractions and divisions take the same number of flops as additions and multiplications, respectively. Multiplication of $m \times n$ and $n \times p$ complex matrices uses $8mnp$ flops. Obtaining $\mathbf{Q}_k = \mathbf{H}_k\mathbf{W}_k$ in (34) requires $8N_r^2N_s$ flops. In order to obtain the noiseless version of the received signal in (34), we multiply $\mathbf{Q}_k \in \mathbb{C}^{N_r \times N_r}$ by the matrix $\mathbf{D} \in \mathbb{C}^{N_r \times 2^m}$, which represents all Rx antennas for user $k$. This multiplication requires $8N_r^2 2^m\xi$ flops. For DQSM, when some spatial symbols are set to zero, the real size of the spatial constellation is reduced in the same proportion. The factor $\xi$ is introduced to take into account the reduction of the size of the spatial constellation in comparison with the conventional QAM constellations. This factor is obtained by counting the number of zero-valued entries in the DQSM constellation.
Subtraction in the ML criterion requires $2N_r 2^m$ flops, obtaining the absolute values requires $3N_r 2^m$ flops, the maximum ratio combining (MRC) of the received signals of each Rx antenna requires $2N_r 2^m$ flops, and ordering all results to find the minimum requires $2(2^m)$ flops. Adding the last four partial results, we obtain approximately $7N_r 2^m\xi$ flops. Table 2 summarizes the complexity of the partial operations carried out for Strategy I and Strategy II.

Finally, adding all these partial results, the detection complexity for the RIS-MU-DQSM-I system is

$$\eta_{I} = 8N_r^2N_s + 8N_r^2 2^m\xi + 7N_r 2^m\xi. \tag{37}$$

The detection complexity for the RIS-MU-DQSM-II system (35) is obtained as follows. The product of $\mathbf{H}_k \in \mathbb{C}^{N_r \times N_s}$ by $\mathbf{G} \in \mathbb{C}^{N_s \times N_t}$ requires $8N_rN_sN_t$ flops. Multiplying by the matrix $\mathbf{Z}^{Eq}_k \in \mathbb{C}^{N_t \times N_r}$ requires $8N_r^2N_t$ flops. Multiplying by the matrix $\mathbf{D} \in \mathbb{C}^{N_r \times 2^m}$ requires $8N_r^2 2^m\xi$ flops. Similar to (37), the ML criterion requires approximately $7N_r 2^m\xi$ flops. Adding all these partial results, we obtain the detection complexity for Strategy II as

$$\eta_{II} = 8N_rN_sN_t + 8N_r^2N_t + 8N_r^2 2^m\xi + 7N_r 2^m\xi. \tag{38}$$

In (37) and (38), $\xi$ takes into account that the lattice of the DQSM constellation is reduced by the inserted zeros [19]. Thus, for the conventional MU-MIMO-SMux system, $\xi = 1$. For the RIS-MU-DQSM schemes, the factor $\xi$ is evaluated by directly counting the entries with zero value in the real or imaginary parts of the spatial constellation. Considering a spectral efficiency (SE) of 8 bits per channel use (bpcu)/user, we obtain $\xi = 0.75$, whereas for an SE of 12 bpcu/user, $\xi = 0.4375$. In the next section, the complexity results are analyzed and discussed.

Results and Discussion
In this section, we analyze and discuss the BER performance and the detection complexity of the proposed system considering two configurations with different SE and two different scenarios: the uncorrelated and the correlated channels. The results of the proposed system are compared with the conventional MU-MIMO-SMux system [22] and with the recently proposed relay-assisted AF-MU-DQSM system [23] under the same conditions and parameters.

Detection Complexity Results
In order to obtain fair comparisons, all systems use the optimal ML detection criterion. Table 3 shows a comparison of the detection complexity for all systems considering the two configurations used as study cases. The results show that, for the $(4 \times 2) \times 8 \times 8$ configuration with an SE of 8 bpcu, the RIS-MU-DQSM-I and RIS-MU-DQSM-II systems have 25% and 16% lower detection complexity, respectively, than the conventional MU-MIMO-SMux scheme. For the $(8 \times 4) \times 32 \times 32$ configuration with an SE of 12 bpcu, the proposed RIS-MU-DQSM-I and RIS-MU-DQSM-II systems have 56% and 51% lower detection complexity, respectively, than the conventional MU-MIMO-SMux system used as reference. The AF-MU-DQSM system used as a second reference has the same detection complexity as the RIS-MU-DQSM-II system, since they use the same modulation, precoding, and detection strategies.
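The flop counts behind these percentages can be tallied with a short Python sketch. The counts for the two strategies follow the partial results stated in the complexity analysis. The count used for the conventional MU-MIMO-SMux reference is an assumption made here, namely the Strategy I expression with $\xi = 1$ and $N_t$ in place of $N_s$, because its formula is not given in this section.

```python
def eta_strategy_1(N_r, N_s, m, xi):
    """Flops: Q = H W, then Q D, then the ML metric (approximate)."""
    return 8 * N_r**2 * N_s + 8 * N_r**2 * 2**m * xi + 7 * N_r * 2**m * xi

def eta_strategy_2(N_r, N_s, N_t, m, xi):
    """Flops: H G, then (H G) Z, then the product with D, then the ML metric."""
    return (8 * N_r * N_s * N_t + 8 * N_r**2 * N_t
            + 8 * N_r**2 * 2**m * xi + 7 * N_r * 2**m * xi)

def eta_reference(N_r, N_t, m):
    """ASSUMED reference count: Strategy I expression with xi = 1 and N_s -> N_t."""
    return eta_strategy_1(N_r, N_t, m, xi=1.0)

def reduction_percent(eta, eta_ref):
    return 100.0 * (1.0 - eta / eta_ref)

configs = {
    "(4x2)x8x8, 8 bpcu": dict(N_r=2, N_s=8, N_t=8, m=8, xi=0.75),
    "(8x4)x32x32, 12 bpcu": dict(N_r=4, N_s=32, N_t=32, m=12, xi=0.4375),
}

results = {}
for name, c in configs.items():
    ref = eta_reference(c["N_r"], c["N_t"], c["m"])
    e1 = eta_strategy_1(c["N_r"], c["N_s"], c["m"], c["xi"])
    e2 = eta_strategy_2(c["N_r"], c["N_s"], c["N_t"], c["m"], c["xi"])
    results[name] = (reduction_percent(e1, ref), reduction_percent(e2, ref))
    print(f"{name}: I {results[name][0]:.1f}% lower, II {results[name][1]:.1f}% lower")
```

Under this assumed reference, the reductions come out at roughly 24.5% and 16.0% for the first configuration and 55.9% and 50.8% for the second, which is in line with the rounded figures quoted from Table 3.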

BER Performance
For the simulations, all systems use a normalized transmission energy per user, i.e., $E[\mathbf{t}_x^H\mathbf{t}_x] = K$, the same number of Tx/Rx antennas, and the same SE. All systems also use similar precoding strategies and the optimal ML detection criterion. In order to carry out fair comparisons, all systems are compared considering MUI. The channel uses a spatial correlation coefficient of $\rho = 0.7$ at the BS and the RIS/relays. Tables 4 and 5 show the configuration and the QAM constellation utilized for SE = 8 bpcu/user and SE = 12 bpcu/user, respectively. Note that systems using spatial modulation require lower-order QAM constellations, while the conventional system requires higher-order QAM constellations to achieve the desired SE. All simulations were carried out using MATLAB SA 2015 ®.

Table 4. Simulation parameters for SE = 8 bpcu/user, 4 users, $N_t = 8$, $N_r = 2$, $L = 2$.

For the uncorrelated fading channel and the $(4 \times 2) \times 32 \times 8$ configuration, the proposed RIS-MU-DQSM-II system has 17 dB and 15 dB BER performance gains when compared with the conventional MU-MIMO-SMux and the AF-MU-DQSM systems, respectively, as shown in Figure 6. For the $(8 \times 4) \times 64 \times 32$ configuration, the proposed RIS-MU-DQSM-II system has 14 dB and 18 dB BER performance gains when compared with the conventional MU-MIMO-SMux and the AF-MU-DQSM systems, respectively, as shown in Figure 7. As shown in Figures 8 and 9, when the correlated fading channel is considered, the RIS-MU-DQSM-II and the AF-MU-DQSM schemes are degraded by approximately 7 dB to 15 dB. Under this scenario, the conventional MU-MIMO-SMux system is degraded by only 5 dB. This result can be explained by the spatial correlation affecting the reflecting mirrors or the relaying antennas. On the other hand, the proposed RIS-MU-DQSM-I system is degraded by only 2-3 dB, which indicates that the ZF technique is robust under the correlated fading scenario. Note that users who are free of MUI (marked as RIS-MU-DQSM-I*) have a BER performance similar to that of the users under MUI. This fact shows the robustness of the BD technique used to avoid interference in the second hop. It is worth noting that when the proposed schemes use a reduced number of mirrors in the RIS, they have a BER performance similar to that of the conventional MU-MIMO-SMux scheme. However, when an increased number of mirrors is used, the proposed RIS-MU-DQSM-II clearly outperforms both schemes used as reference.

Data Availability Statement: Not applicable.

Conflicts of Interest:
The authors declare no conflict of interest.