Outdoor-to-Indoor mmWave Relaying with Massive MIMO: Impact of Imperfect Channel Estimation

Abstract: Assuming incomplete knowledge of the channel state information (CSI), we investigate two millimeter wave (mmWave) relaying scenarios that support outdoor-to-indoor communications, in which an outdoor base station (BS) in an urban cellular network serves several indoor users. Both scenarios rely on a two-hop full-duplex (FD) relaying scheme and on massive multiple-input-multiple-output (mMIMO) arrays. In the first scheme, the BS applies zero-forcing (ZF) precoding based on the overall (end-to-end) channel response. In the second scheme, the BS precodes based only on the first-hop channel response, while the relay precodes (using the second-hop channel response), amplifies, and forwards the signals. We derive the average signal-to-noise ratio (SNR) expressions for the two scenarios and quantify the asymptotic SNR. Performance is evaluated using the outage probability, for which we derive closed-form expressions, the end-to-end channel capacity, and the energy efficiency. The results are compared with those derived assuming complete knowledge of the CSI, so that the effect of imperfect CSI is assessed against the perfect-CSI reference. We also present Monte Carlo simulation results to assess the accuracy of our analytical results. Finally, the two schemes are compared in terms of channel estimation and precoding complexity, the number of antennas, and the number of users. Practical deployment recommendations are formulated at the end of this work.


Introduction
A preliminary version of this work was published in [1], and its results serve as a reference against which the new results are compared. The present work extends that version by taking into account the important and more practical impact of channel estimation.
The millimeter wave (mmWave) spectrum is an essential element of fifth-generation (5G) and future (6G) wireless communications, as stated in [2]. The mmWave technology offers the benefit of increased capacity [3] and more focused beams [4,5], but it comes with the drawback of greater signal loss when penetrating buildings. This disadvantage is often reduced by employing relays to connect indoor users to outdoor transceivers, along with scalable massive multiple-input-multiple-output (mMIMO) configurations [6,7] that enhance signal coverage [8,9], allow nearly optimal linear processing techniques (such as zero-forcing (ZF) and maximum-ratio (MR)), and reduce channel estimation complexity (through channel hardening, among other methods). Many mMIMO detection algorithms have been proposed in the literature; ref. [10] provides a survey of those algorithms.
Full-duplex (FD) relaying, as described in [11], involves nodes that can transmit and receive at the same time and on the same frequency resource. This is a significant step towards meeting the spectral efficiency and system capacity needs of future wireless networks. Although some underlying concerns remain to be addressed [12], the directional characteristics of mMIMO operating in the mmWave frequency range help to mitigate the challenges associated with implementing FD relaying.
Power considerations are of utmost importance in this context, particularly because precoding is necessary to prepare the signals for the mMIMO transmissions. To account for the power limitations at the base station (BS) and the relay node, it is necessary to normalize the power of the precoding matrix [11,13]. In [14], the authors present an expression for the trace of the ZF precoding matrix at the base station. In turn, [15] provides the expression of the normalization coefficient at the relay node for different relaying schemes.
There are two main approaches to normalization: vector normalization (VN) and matrix normalization (MN). The authors in [16] compare VN and MN for various linear precoding techniques. Channel estimation is a crucial concern that has been addressed in numerous studies exploring the problem of mMIMO-based mmWave relaying (e.g., [17,18]). According to [14], the complexity of channel estimation can be decreased in a mMIMO system thanks to channel hardening. In this situation, it is only necessary to use uplink pilot symbols to estimate the channel using reciprocity; the residual channel estimation error can then be disregarded when determining the power normalization coefficient [19].
This paper examines two outdoor-to-indoor mmWave FD relaying strategies considering both perfect [1] and imperfect CSI. The performance of these strategies is evaluated based on outage probability, capacity, and energy efficiency while also considering the complexity of channel estimation and precoding for each scheme. In the first design, the BS employs ZF precoding using the end-to-end two-hop channel response. In this scenario, the relay normalizes, amplifies, and forwards the signals to the indoor users. The BS estimates the equivalent uplink channel, and the downlink channel state information (CSI) is derived by leveraging the principle of channel reciprocity [20].
In contrast, in the second technique the BS performs precoding based only on the channel knowledge of the first hop. In this second scheme, the relay normalizes, precodes (using the second-hop channel response), amplifies, and forwards the signals. The relay estimates the second-hop channel, while the first-hop channel is estimated at the BS. The power normalization coefficients are approximated by their asymptotic limits, enabling us to calculate the average and asymptotic signal-to-noise ratios (SNRs) in the presence of both perfect and imperfect CSI. The outage probability expressions are derived for each precoding scheme.
Furthermore, we provide estimates of the complexity involved in channel estimation and precoding. We also deduce a relationship between the numbers of antennas in the two schemes that guarantees equal complexity. Therefore, the subsequent performance comparison framework considers two cases: (i) analyzing performance without taking complexity into account and (ii) choosing different numbers of antennas in both schemes to ensure equal complexity and enable a fair performance comparison.
The rest of the paper is structured as follows. The global system model is introduced in Section 2. Section 3 discusses and analyzes the first scheme, whereas Section 4 provides the same analysis for the second scheme. Section 5 presents the numerical and simulation results and highlights the performance obtained under different system parameters and constraints. The paper is concluded in Section 6.

Mathematical Notation
In the rest of the paper, we use the following notational conventions: $(\cdot)^T$ and $(\cdot)^H$ denote the transpose and the Hermitian transpose, respectively, $E[\cdot]$ the mathematical expectation, $\|\cdot\|$ the vector norm, $\mathrm{tr}[\cdot]$ the trace of a matrix, and $(\cdot)^+$ the Moore-Penrose pseudo-inverse of a matrix.

System and Signal Models
As illustrated in Figure 1, we consider $N_u$ single-antenna users in a dual-hop downlink system in communication with a BS $B$ equipped with $N_B$ antennas via an FD relay node $R$ equipped with $N_{R_1}$ receive antennas and $N_{R_2}$ transmit antennas. At time instant $i$, $R$ receives a data vector $y_{RR}$ and transmits the previously received data, which has been processed, normalized, and amplified, as $y_{RT}$. To lighten the notation, we drop the time index in the remainder of the analysis. Let $H_1 \in \mathbb{C}^{N_{R_1} \times N_B}$ and $H_2 \in \mathbb{C}^{N_u \times N_{R_2}}$ denote the channel matrix between the BS and the relay receiver and the channel matrix between the relay transmitter and the users, respectively. Both nodes $B$ and $R$ transmit with large numbers of antennas. Assuming that all the paths between the BS and the receiver side of $R$ have the same large-scale fading statistic $\beta_1$, we write $H_1 = \sqrt{\beta_1} Z_1$; since a Rayleigh fading channel model is considered in this work, the elements of $Z_1$ are independent circularly symmetric complex Gaussian with zero mean and unit variance. The second hop is modeled similarly: all paths between the relay transmitter and the individual users have the same large-scale fading statistic $\beta_2$, i.e., $H_2 = \sqrt{\beta_2} Z_2$, where the elements of $Z_2$ are independent circularly symmetric complex Gaussian with zero mean and unit variance.
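The channel model above can be illustrated with a short NumPy sketch. It is only a sketch under stated assumptions: the large-scale coefficients are interpreted as per-entry variances (so each channel is $\sqrt{\beta}$ times a unit-variance matrix), and the helper name `rayleigh_channel`, the array sizes, and the numerical values of $\beta_1$ and $\beta_2$ are hypothetical choices made for illustration, not values taken from this paper.

```python
import numpy as np

def rayleigh_channel(n_rx, n_tx, beta, rng):
    """Draw H = sqrt(beta) * Z, with Z having i.i.d. CN(0, 1) entries.

    Assumption: beta is the per-entry variance (large-scale power gain).
    """
    z = (rng.standard_normal((n_rx, n_tx))
         + 1j * rng.standard_normal((n_rx, n_tx))) / np.sqrt(2.0)
    return np.sqrt(beta) * z

rng = np.random.default_rng(0)
n_u, n_r, n_b = 4, 64, 64        # illustrative sizes (N_R1 = N_R2 = N_R here)
beta_1, beta_2 = 0.5, 0.8        # illustrative large-scale coefficients

h_1 = rayleigh_channel(n_r, n_b, beta_1, rng)   # BS -> relay, N_R x N_B
h_2 = rayleigh_channel(n_u, n_r, beta_2, rng)   # relay -> users, N_u x N_R
h_eq = h_2 @ h_1                                # end-to-end channel, N_u x N_B

# Empirical per-entry power of the first hop; should be close to beta_1.
emp_var_h1 = np.mean(np.abs(h_1) ** 2)
# Per-entry power of the cascade; its expectation is N_R * beta_1 * beta_2.
emp_var_eq = np.mean(np.abs(h_eq) ** 2)
```

The product `h_2 @ h_1` has one row per user and one column per BS antenna, which is the shape the BS needs when it precodes on the end-to-end response.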
We denote by $q$ the $N_u \times 1$ vector of user symbols, with $E[qq^H] = I_{N_u}$. The transmitted vector $x$ is then given by x = where $P$ is the source power budget for each user, and $F_1$ is the precoding matrix, designed such that $E[\|x\|^2] = N_u P$. The received signal at $R$ is given by: where is the matrix representing the self-interference channel (taking into consideration the advantages of mMIMO antenna array directivity, we neglect the self-interference term in the remainder of the analysis [21]), $y_{RT}$ is the data vector transmitted by $R$ during the same time interval (containing data received in the previous time interval, which has been processed, normalized, and amplified), and $n_1$ represents zero-mean additive white Gaussian noise (AWGN) at the relay's receiver side with variance $\sigma_1^2$. The processing at the relay consists of a linear signal processing unit, which is denoted by the relaying matrix $F_2$ and results in a processed version expressed as The relay then normalizes and amplifies the signal before forwarding it to the indoor users. Let $\alpha$ denote the normalization coefficient (in the remainder of the text, to avoid confusion, we denote the normalization coefficients by $\alpha_1$ and $\alpha_2$ when analyzing the first and second relaying schemes, respectively) given by With a power budget of $P_R$ at the relay, the transmitted signal is then expressed as Finally, user $k$ receives where $n_{(2,k)}$ represents the zero-mean AWGN with variance $\sigma_2^2$ at user $k$, $h_{(2,k)}$ and $h_{(D,k)}$ denote, respectively, the $k$-th rows of matrices $H_2$ and $H_D$, and $H_D \in \mathbb{C}^{N_u \times N_B}$ represents the direct link channel matrix. Due to the blockage of mmWave signals by the outdoor-to-indoor separation, the direct link is subsequently neglected [22].

Case of Perfect CSI
In this case, the relay is simply designed as an all-pass amplify-and-forward (AF) unit, i.e., $N_{R_2} = N_{R_1} = N_R$ and $F_2 = I_{N_R}$, while the BS performs precoding based on the end-to-end channel. Let us denote $H_{eq} = H_2 H_1 = \sqrt{N_R \beta_1 \beta_2}\, Z_{eq}$. According to the central limit theorem, for large $N_R$ the elements of $Z_{eq}$ are approximately complex Gaussian with zero mean and unit variance. As in [14], the ZF beamforming matrix is given by , where $Z_{eq}^{+} = Z_{eq}^H (Z_{eq} Z_{eq}^H)^{-1}$. To characterize the complexity of this first scheme, we consider the complexity analysis framework presented in [23], wherein only multiplications are taken into account. Therefore, the precoding complexity of Scheme 1 can be directly approximated as $O(2 N_B N_u^2 + N_u^3)$.
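As a minimal numerical illustration of the pseudo-inverse used by the ZF precoder, the sketch below forms $Z^H (Z Z^H)^{-1}$ for a randomly drawn wide matrix and checks that it removes inter-user interference. The function name `right_pseudo_inverse` and the matrix sizes are hypothetical, and the power normalization applied to the precoder in the paper is deliberately left out.

```python
import numpy as np

def right_pseudo_inverse(z):
    """Return Z^+ = Z^H (Z Z^H)^{-1} for a wide, full-row-rank matrix Z."""
    gram = z @ z.conj().T                      # N_u x N_u Gram matrix
    return z.conj().T @ np.linalg.inv(gram)    # N_B x N_u

rng = np.random.default_rng(1)
n_u, n_b = 4, 32                               # illustrative sizes
z_eq = (rng.standard_normal((n_u, n_b))
        + 1j * rng.standard_normal((n_u, n_b))) / np.sqrt(2.0)

z_pinv = right_pseudo_inverse(z_eq)

# Zero-forcing property: the cascade Z Z^+ is the N_u x N_u identity,
# so each user only sees its own symbol (before noise and normalization).
residual = np.linalg.norm(z_eq @ z_pinv - np.eye(n_u))
```

For a full-row-rank matrix this closed form coincides with the Moore-Penrose pseudo-inverse returned by `np.linalg.pinv`, which is usually the numerically safer choice when the Gram matrix is poorly conditioned.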

Asymptotic SNR Analysis
In the following, we first give the asymptotic value of α in the large number of antennas regime; then, we derive the instantaneous, average, and asymptotic SNR expressions.
With Scheme 1, the normalization coefficient is expressed as which, after a few manipulations, yields Based on [24] (Lemma 1), when $N_R = N_B = N$, we obtain and Therefore, the normalization coefficient $\alpha_1$ asymptotically goes to and since the signal received by user $k$ is expressed as the end-to-end instantaneous SNR can be expressed as the average SNR can be expressed as and, in the large number of antennas regime, the end-to-end SNR at user $k$ can be expressed as Equation (12) shows that the asymptotic SNR for Scheme 1 goes to infinity as $N \to \infty$. This means that the undesirable effects from relay and user noise disappear when $N$ becomes large, and only the useful signal is dominant. It is well known that one advantage of using mMIMO is that power scaling can improve the energy efficiency while maintaining a desired capacity due to the large diversity gain of the large antenna array. Therefore, we propose and analyze a few typical power scaling options for Scheme 1. The asymptotic SNRs with power scaling ($P = E/N^a$, $P_R = E_R/N^b$, $0 \le a, b \le 1$) are summarized in Table 1. Note that $E$ and $E_R$ are fixed regardless of $N$. Finally, we can straightforwardly express the system's capacity and asymptotic capacity, respectively, as C asym 1 ).
Table 1. Asymptotic SNRs with different power scaling-Scheme 1 and perfect CSI.

Outage Probability Analysis
In this subsection, we derive the outage probability of Scheme 1 using (10) and (8). The instantaneous SNR can be rewritten as where , and $X$ and $Y$ are two independent chi-squared distributed variates with two degrees of freedom [25]. Note that $\lambda_1 X$ and $\lambda_2 Y$ are gamma variables with shape parameter 1 and respective scale parameters $a_1 = 1/(2\lambda_1)$ and $a_2 = 1/(2\lambda_2)$. Then, by following the same approach as in [25], we obtain the cumulative distribution function (CDF) of the approximate SNR in Scheme 1 as This yields the outage probability expression that we will evaluate numerically later in Section 5.
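The outage probability is the CDF of the SNR evaluated at a threshold, so the building block of the derivation is the distribution of a scaled chi-squared variate. The sketch below checks only that building block, not the closed-form CDF of this paper: under the standard definition of a chi-squared variate with two degrees of freedom, $\lambda X$ is exponentially distributed with mean $2\lambda$, so the constant $1/(2\lambda)$ appears in the exponent of its CDF. The value of `lam` and the threshold are arbitrary illustrative choices.

```python
import numpy as np

def scaled_chi2_cdf(x, lam):
    """CDF of lam * X, where X is chi-squared with 2 degrees of freedom."""
    return 1.0 - np.exp(-np.asarray(x, dtype=float) / (2.0 * lam))

rng = np.random.default_rng(2)
lam = 0.7                  # illustrative scaling factor
threshold = 1.0            # illustrative outage threshold

# Monte Carlo estimate of P(lam * X <= threshold).
samples = lam * rng.chisquare(df=2, size=200_000)
empirical = np.mean(samples <= threshold)

# Closed-form value for comparison.
analytical = float(scaled_chi2_cdf(threshold, lam))
```

The same Monte Carlo pattern (draw the variates, form the SNR, count how often it falls below the threshold) is the usual way to validate a closed-form outage expression by simulation.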

Case of Imperfect CSI
In this subsection, we consider the case of imperfect CSI, for which the channels are estimated using a minimum mean square error (MMSE) approach, and we re-derive the SNR and the outage probability expressions obtained in the previous subsection.For Scheme 1, we assume that the relay amplifies and forwards the received pilots, and only the BS estimates the equivalent end-to-end channel.

Pilot Symbol Transmission
We denote by $\tau_c$ the number of samples that can be sent during a channel coherence interval, where $\tau_c = B \cdot T$, $T$ is the duration of the coherence interval over which the channel is considered time-invariant, and $B$ is the bandwidth of the waveform, over which the frequency response of the channel is considered flat. Each coherence interval hosts $N_u$ orthogonal pilot waveforms of length $\tau_P$, where $N_u \le \tau_P \le \tau_c$ [14]. Note that the BS estimates the equivalent end-to-end channel for each user based on the received uplink pilots.
Let us denote by $\Phi$ a $\tau_P \times N_u$ unitary matrix that contains the pilot symbols, such that and all the users simultaneously transmit $N_u$ signals of duration $\tau_P$ and of the form [14] The relay receives the pilot symbols and amplifies and retransmits the signal to the BS, where the received signal can be expressed as where $N_3$ represents the zero-mean AWGN matrix at the relay, the elements of which all have a variance $\sigma_3^2$. Finally, let us denote by $\alpha_P = \mathrm{tr}(Y_{PR} Y_{PR}^H)$ the power normalization coefficient, and the relay hence transmits We can note that $\alpha_P$ has the same structure as $\alpha_1$ and can be approximated as:

Despreading at the BS
During this phase, the BS uses the pilot symbol matrix $\Phi$ for despreading the signal. Since the received pilot signal at the BS is given by where $N_4$ represents the zero-mean AWGN matrix at the BS, for which the elements all have a variance $\sigma_4^2$, this yields a received signal with the form

MMSE Channel Estimation
Let us denote by $N_a \triangleq (\sqrt{1/\alpha_P}\, H_1^T N_3 + N_4)\Phi$ the aggregate noise. The elements of $N_a$ are independent and identically distributed (i.i.d.) circularly symmetric Gaussian variables, i.e., $N_a(i, j) \sim \mathcal{CN}(0, N_R \beta_1 \sigma_3^2/\alpha_P + \sigma_4^2)$, where $i$ and $j$ denote the row and column indices, respectively, of $N_a$. Referring to [26], the estimated channel between the BS and the users is thus given by with $\Delta = (\tau_P \beta_e + N_R \beta_1 \sigma_3^2)/\alpha_P + \sigma_4^2$, and It is worth noting that the variance of the elements of $\hat{H}_{eq}^T$ can be expressed as and denoting by $E \triangleq \hat{H}_{eq} - H_{eq}$ the estimation error matrix, the variance of the elements of $E$ is $\beta_e - \zeta_e^2$, and $0 \le \mu_e \triangleq \zeta_e^2/\beta_e \le 1$ denotes the channel estimation reliability. The computational complexity of channel estimation was studied and approximated for several channel estimation methods in [27]. Referring to it, we approximate the MMSE-based channel estimation complexity as $O(N_B^3 \tau_P^3)$. This complexity will be added to the precoding complexity already computed in Section 3.1.
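The pilot, despreading, and MMSE steps can be illustrated on a deliberately simplified single-hop model. This sketch is not the two-hop estimator of Scheme 1: it assumes that the users' pilots reach the BS directly through a channel with per-entry variance `beta` and noise variance `sigma2`, and it uses the textbook scalar MMSE gain for that model. All names and numerical values are hypothetical. Its purpose is to show numerically that the estimate variance $\zeta^2$ and the error variance add up to $\beta$, which is the structure behind the reliability ratio $\zeta^2/\beta$.

```python
import numpy as np

def crandn(shape, rng):
    """i.i.d. circularly symmetric complex Gaussian samples, CN(0, 1)."""
    return (rng.standard_normal(shape)
            + 1j * rng.standard_normal(shape)) / np.sqrt(2.0)

rng = np.random.default_rng(3)
n_b, n_u, tau_p = 256, 4, 8      # illustrative sizes, N_u <= tau_P
beta, sigma2 = 1.0, 0.5          # illustrative channel and noise variances

# Orthonormal pilots: first N_u columns of the unitary DFT matrix.
phi = np.fft.fft(np.eye(tau_p))[:, :n_u] / np.sqrt(tau_p)

# Simplified single-hop pilot phase (assumption, see lead-in).
h = np.sqrt(beta) * crandn((n_b, n_u), rng)
noise = np.sqrt(sigma2) * crandn((n_b, tau_p), rng)
y_pilot = np.sqrt(tau_p) * h @ phi.conj().T + noise

# Despreading: correlate the received block with the pilot matrix.
y_despread = y_pilot @ phi       # = sqrt(tau_p) * h + (noise @ phi)

# Scalar MMSE gain for this model and the resulting variances.
gain = np.sqrt(tau_p) * beta / (tau_p * beta + sigma2)
h_hat = gain * y_despread
zeta2 = tau_p * beta**2 / (tau_p * beta + sigma2)   # variance of the estimate
mu = zeta2 / beta                                   # reliability in [0, 1]

emp_var_hat = np.mean(np.abs(h_hat) ** 2)
emp_var_err = np.mean(np.abs(h_hat - h) ** 2)       # sign does not matter here
```

Increasing `tau_p` or decreasing `sigma2` pushes `mu` towards 1, i.e., towards the perfect-CSI case.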

Asymptotic SNR Analysis
Here, zero-forcing precoding is based on the estimated channel in (24), and the precoding matrix is given by , where $\hat{Z}_{eq} \triangleq (1/\zeta_e) \hat{H}_{eq}$ is the normalized estimated equivalent channel. The transmitted signal vector is thus given by and the relay receives the following signal: Let us denote $\alpha_1 = \mathrm{tr}(y_{RR} y_{RR}^H)$, which, after a few manipulations, can be written as To simplify the derivations, we write $\hat{Z}_{eq}$ as $\hat{Z}_{eq} = \frac{1}{\zeta_e}(H_{eq} + E)$. When $N_R = N_B = N$, we obtain the following convergences: Referring to (25), we can express $\mu_e^2$ as µ 2 e = . This yields an expression of the signal received by the user under the form Under these notations, we can derive the instantaneous SNR for user $k$ as where $E_{(k,:)}$ is the $k$-th row of the channel estimation error matrix. Then, the average SNR can be expressed as and the end-to-end SNR in the large number of antennas regime at user $k$ can be expressed as From (33), we can see clearly that the main influence of the channel estimation reliability $\mu_e^2$ is the decrease in the capacity of Scheme 1 due to the channel estimation error. Some typical power scaling options for Scheme 1 under imperfect CSI are summarized in Table 2. We observe that $\mu_e^2$ does not change the power scaling effects on Scheme 1 as $N$ increases, which will also be verified in the numerical results in Section 5.

Table 2 column headings: Power Scaling Parameters; Asymptotic SNR ($N \to \infty$).

Finally, we can straightforwardly express the system capacity and asymptotic capacity, respectively, similarly to (13) and (14).

Outage Probability Analysis
In this subsection, we derive the outage probability of Scheme 1 using (31) and (29). The instantaneous SNR can be rewritten as where , and $X$, $Y$, and $Z$ are three independent chi-squared distributed variates with two degrees of freedom. Note that $\lambda_1 X$, $\lambda_2 Y$, and $\lambda_3 Z$ are gamma variates with shape parameter 1 and respective scale parameters $b_1 = 1/(2\lambda_1)$, $b_2 = 1/(2\lambda_2)$, and $b_3 = 1/(2\lambda_3)$. Then, by following the same approach as in [25], we obtain the CDF of the instantaneous SNR in Scheme 1 under imperfect CSI as This yields the outage probability expression that we will evaluate numerically later in Section 5.

Case of Perfect CSI
In this second scheme, hereafter called Scheme 2, the base station precodes the signal using the channel response of the first hop; then, the relay receives, without interference, the users' signals with $N_u$ antennas, precodes them based on the channel response of the second hop, and normalizes, amplifies, and retransmits the signals to the users. In this case, $N_{R_1} = N_u$ and $N_{R_2} = N_R$. The precoding matrix at the base station is given by

and the relay processing matrix is given by
Therefore, the precoding complexity of Scheme 2 can be approximated as O(2N Assuming the same large number of antennas at both the base station and the relay, i.e., $N_{R_2} = N_B = N$, the complexity can be rewritten as $2\,O(N N_u^2 + N_u^3)$. Under these considerations, we derive the normalization coefficient as

Asymptotic SNR Analysis
Based on [24] (Lemma 1), when $N_R = N_B = N$, we obtain Then, for the normalization coefficient The received signal at user $k$ is given by where $n_{(1,k)}$ is the $k$-th element of vector $n_1$. Now, we can derive the instantaneous and average SNRs, respectively, as When $N_B = N_R = N$ and taking into consideration (38), the asymptotic SNR for this scenario can be expressed as As we have observed in (12) for Scheme 1, Equation (42) also shows that the asymptotic SNR in Scheme 2 grows to infinity as $N \to \infty$. The asymptotic SNRs for a few typical power scaling options are summarized in Table 3.
Table 3. Asymptotic SNRs with different power scaling-Scheme 2 and perfect CSI.

In Table 3, we analyze a few typical power scaling options for Scheme 2 under perfect CSI to take advantage of using mMIMO and to improve the energy efficiency while maintaining the desired capacity. Therefore, we propose some practical cases of asymptotic SNR according to specific values of $a$ and $b$.
From Tables 1 and 3, we see clearly that Scheme 2 outperforms Scheme 1 for all power scaling options. Moreover, for a large number of antennas, increasing $a$ and $b$ to the point where $a + b \ge 1$ leads to a dramatic reduction in the performance of Scheme 1.
Finally, we derive the system capacity and asymptotic capacity, respectively, as ). (44)

Outage Probability Analysis
We adopt the same approach as for Scheme 1 and derive the CDF of the approximate SNR in Scheme 2 as where ), and c 2 = 1/α 2 σ 2 2 .

Case of Imperfect CSI
Here, for Scheme 2, we adopt a cascaded channel estimation, wherein the relay estimates the second-hop channel response and transmits pilot symbols to the BS, which estimates the first-hop channel response.

MMSE Channel Estimation
First, the second-hop channel estimation is given by where $Y_{P,1}$ is the pilot symbol matrix received by the relay after despreading. Similarly to the perfect CSI case, let us denote by $\zeta_2^2$ the variance of the elements of $\hat{H}_2^T$, which is given by and let $E_2 \triangleq \hat{H}_2 - H_2$ denote the channel estimation error matrix over the second hop, for which the elements' variance is given by β 2 ≤ 1 denotes the channel estimation reliability in this case.
Second, in order to estimate the first-hop channel, the relay sends its own pilot symbols to the BS, which obtains an estimate given by where $Y_{P,2}$ is the pilot matrix received by the BS after despreading. Again, $\zeta_1^2$ denotes the variance of the elements of $\hat{H}_1^T$ and is given by $E_1 \triangleq \hat{H}_1 - H_1$ is the channel estimation error matrix over the first hop, for which the elements' variance is given by In this case, the channel estimation complexity is approximated as

Asymptotic SNR Analysis
In this case, zero-forcing precoding is based on the estimated channels in (46) and (48), with precoding matrices 2 ) H 2 are the normalized estimated channel matrices of the first and the second hops, respectively.
Under these notations, the signal received at the relay side can be expressed as In this case, α 2 can be expressed as To alleviate the derivations, we write Z 1 as we obtain

and since
Thus, the signal received by all the users is given by Similarly to the analysis in Section 3.2.4 and after a few simple manipulations, we can derive the instantaneous and the average SNRs, which are given, respectively, by where $E_{1(k,:)}$ and $E_{2(k,:)}$ are the $k$-th rows of the channel estimation error matrices $E_1$ and $E_2$, respectively, and The end-to-end SNR in the large number of antennas regime at user $k$ can be expressed as Intuitively, Scheme 1 is more sensitive to channel estimation reliability than Scheme 2. This is mainly due to the pilot noise amplification at the relay. A few typical power scaling options for Scheme 2 under imperfect CSI are summarized in Table 4.
Table 4. Asymptotic SNRs with different power scaling-Scheme 2 and imperfect CSI.
Finally, we can express the system capacity and asymptotic capacity as in (43) and (44), respectively.

Outage Probability Analysis
Let us now derive the outage probability of Scheme 2 under imperfect channel estimation using (54) and (52).The instantaneous SNR can be rewritten as where , and X, Y, W, and Z are independent chi-squared distributed variates with two degrees of freedom.Then, we obtain the CDF of the approximate SNR in Scheme 2 under imperfect CSI as where ), and d 4 = 1/(2λ 4 ).This yields the outage probability expression that we will evaluate numerically later in Section 5.

Numerical Results
Using Monte Carlo simulations, we now validate our analysis and give a comparative discussion of both schemes, taking into consideration the precoding complexity. Without loss of generality, we assume that Please note that in all figures presented in this section, the markers represent the results of simulations, the solid lines represent the analytical and asymptotic results under the perfect CSI assumption, and the dashed lines represent the results with imperfect MMSE channel estimation.

Capacity and Energy Efficiency
In this subsection, we first examine the capacity and the energy efficiency (EE) of both schemes, where EE is defined as $EE = C/(P + P_R)$. (For simplicity, we only consider transmission energy here, as it is the major component, and we neglect other energy consumption in the system. An exhaustive energy efficiency framework will complement this work in the future.) Figures 2 and 3 show the simulated capacity together with the average and asymptotic capacity for different values of $a$ and $b$, while Figures 4 and 5 show the energy efficiency for different values of $a$ and $b$. All results are given for both perfect and imperfect CSI. Clearly, the asymptotic values presented in Tables 1-4 accurately predict the performance of Scheme 1 and Scheme 2 under perfect and imperfect CSI. In fact, as $N$ grows infinitely large, the capacity when $a = b = 0$ and $a = b = 0.3$ has no upper bound. In the other cases, as $N$ increases, the capacity increases towards a constant asymptotic value, except for $a = b = 1$ with Scheme 1, where it decreases towards zero. Moreover, as discussed in Section 4.1.1, Scheme 2 outperforms Scheme 1 for all power scaling options when $N$ is large enough. Further, observing Figures 4 and 5, we can conclude that Scheme 2 is the best in terms of energy efficiency when combined with relevant power scaling.
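The interplay between power scaling and energy efficiency can be made concrete with a small sketch. It combines the scaling rule $P = E/N^a$, $P_R = E_R/N^b$ with the definition $EE = C/(P + P_R)$ given above. The function names, the budgets, and in particular the capacity value passed in are hypothetical placeholders: the capacity would normally come from the analytical or simulated results, which are not reproduced here.

```python
def scaled_powers(e_bs, e_relay, n, a, b):
    """Return (P, P_R) = (E / N**a, E_R / N**b) for 0 <= a, b <= 1."""
    if not (0.0 <= a <= 1.0 and 0.0 <= b <= 1.0):
        raise ValueError("scaling exponents a and b must lie in [0, 1]")
    return e_bs / n**a, e_relay / n**b

def energy_efficiency(capacity, p_bs, p_relay):
    """EE = C / (P + P_R), counting transmission power only."""
    return capacity / (p_bs + p_relay)

# Illustrative numbers only: fixed budgets E = E_R = 10, N = 100 antennas,
# full scaling at the BS (a = 1) and square-root scaling at the relay.
p, p_r = scaled_powers(e_bs=10.0, e_relay=10.0, n=100, a=1.0, b=0.5)

# Placeholder capacity value (hypothetical, not a result of this paper).
ee = energy_efficiency(capacity=5.0, p_bs=p, p_relay=p_r)
```

Because the transmit powers shrink with $N$ while the capacity may stay bounded away from zero, the ratio can grow with the array size, which is the effect the power scaling options are meant to exploit.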
However, in terms of complexity, Scheme 2 is worse than Scheme 1. Hence, for a fair comparison, and in order to get the same complexity, we select different values of $N$ for each scheme in what follows. Let us denote by $N_1$ and $N_2$ the numbers of antennas for Scheme 1 and Scheme 2, respectively. Based on the complexity approximations for channel estimation and precoding for each scheme, and in order to get the same complexity for both schemes, the following constraint is respected: and Figures 6-9 show the capacity and energy efficiency for both schemes with the same complexity level. We clearly see that with the additional complexity constraint, Scheme 1 now always outperforms Scheme 2 for the perfect CSI condition, except when the power scaling at the relay is very high, i.e., $b = 1$. However, under imperfect CSI, Scheme 2 remains a better choice at the same complexity level for all proposed $a$ and $b$ values.
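To give a feel for the complexity gap, the sketch below evaluates only the precoding terms quoted earlier: $2 N_B N_u^2 + N_u^3$ for Scheme 1 and $2(N N_u^2 + N_u^3)$ for Scheme 2 when both the BS and the relay use $N$ antennas. Treating these order-of-magnitude expressions as exact multiplication counts is an assumption made for illustration, the channel estimation cost is not included, and the equal-complexity constraint used for the figures is not reproduced. The function names are hypothetical.

```python
def precoding_mults_scheme1(n_b, n_u):
    """Leading-order multiplication count for Scheme 1 precoding."""
    return 2 * n_b * n_u**2 + n_u**3

def precoding_mults_scheme2(n, n_u):
    """Leading-order count for Scheme 2 with N antennas at BS and relay."""
    return 2 * (n * n_u**2 + n_u**3)

n, n_u = 64, 4                              # illustrative sizes
cost_1 = precoding_mults_scheme1(n, n_u)    # 2*64*16 + 64 = 2112
cost_2 = precoding_mults_scheme2(n, n_u)    # 2*(64*16 + 64) = 2176
extra = cost_2 - cost_1                     # N_u**3 additional multiplications
```

With equal antenna numbers, the precoding terms differ only by $N_u^3$, so under these expressions most of the complexity gap between the schemes must come from the channel estimation stage.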

Outage Probability
In this subsection, we compare the studied schemes in terms of outage probability, and for space considerations, we limit our analysis to the case of power scaling at the relay, i.e., $a = 0$ and $b = 1$. Moreover, in all figures, we assume that both schemes have the same complexity, i.e., the constraints in (59) and (60) are respected. In Figure 10, we first note that the simulation results confirm the accuracy of the analytical expressions obtained in Sections 3.1.2, 3.2.5, 4.1.2, and 4.2.3. We conclude that Scheme 1 outperforms Scheme 2 in terms of outage probability when assuming the same complexities and perfect CSI for some numbers of transmitting antennas. Under imperfect CSI, we see that Scheme 2 is always the best.

Conclusions
In this paper, we have analyzed the end-to-end capacity, its asymptotic approximation, and the outage probability of an outdoor-to-indoor dual-hop full-duplex mmWave multiuser system for which the transmitting nodes are equipped with massive numbers of antennas while the final users are equipped with single antennas. Two precoding schemes were proposed, and the performance metrics were derived for each scheme under both perfect and imperfect CSI conditions. An approximation of the precoding complexity was also given for each precoding scheme. To respect the power budget allocated to the relay, the normalization coefficient was approximated, capitalizing on the law of large numbers. Monte Carlo simulation results were presented to confirm the accuracy of our analysis. Practical recommendations were formulated at the end of the analysis for both pure performance and fair performance-complexity trade-off comparisons.

Table 2. Asymptotic SNRs with different power scaling-Scheme 1 and imperfect CSI.