Secrecy Analysis and Error Probability of LIS-Aided Communication Systems under Nakagami-m Fading

Large intelligent surfaces (LIS) are a new trend to achieve higher spectral efficiency and signal-to-noise ratio in mobile communications. For this reason, this paper proposes metrics to analyze the performance of multiple-antenna systems aided by LIS and derives the spectral efficiency, secrecy outage probability, and bit error probability in an environment with Nakagami-m distributed fading. In addition to an eavesdropper, the system comprises a single-antenna user, an antenna array at the transmitter side, and a possible direct link between transmitter and receiver. This study assumes that the LIS performs non-ideal phase cancellation, leading to a residual phase error that follows a Von Mises distribution, and shows that the resulting channel can be accurately approximated by a Gamma distributed SNR whose parameters are analytically derived. These formulas make it possible to evaluate the effect of the strength of the line-of-sight link by varying the Nakagami parameter, m.


Introduction
Large Intelligent surfaces (LIS) are a promising technology for beyond fifth-generation (B5G) systems, given the number of papers emphasizing their advantages, whether compared to relays [1] or even when used to enhance the power of millimeter wave technologies [2]. Furthermore, reflecting signals with extreme precision and without power consumption can reduce the interference and improve the signal-to-noise ratio at the receiver, especially when the direct path between transmitter and destination is weak and needs to be strengthened.
Also known as large reflecting surfaces, LIS have recently been studied as a solution for different modulation schemes and communication channels, and their performance metrics show significant potential for mobile communications. For example, Yang et al. [3] proposed a transmission protocol to reduce the channel estimation overhead when adjacent cells share the same reflection coefficients. They also used optimization methods to allocate the transmit power and maximize the achievable rate in an orthogonal frequency division multiplexing (OFDM) scheme under frequency-selective channels.
In [4], Basar presented a mathematical framework to obtain the signal-to-noise ratio and derive the symbol error probability of an LIS-aided communication system, with or without knowledge of the channel phases. The author also proposed an access point sending signals directly to the users aided by an LIS. Wymeersch et al. [2] emphasized that, although there are already other techniques for high frequencies (0.1 to 1 THz), these technologies are limited by multipath propagation and obstacles present in the environment. In this case, an LIS can control the physical propagation environment, decrease energy consumption, and simplify location and mapping systems by creating a line-of-sight (LoS) path between transmitter and receiver.
In [5], the authors presented solutions for the adjustment of the LIS elements' phases, which optimizes the channel capacity and the precoder applied on the transmitter side. Elbir et al. [6] developed a deep learning framework to obtain the channel state information (CSI) in a massive multiuser MIMO system aided by a LIS. The authors estimated each user's composite channel and the direct path through a convolutional neural network whose inputs are the received pilot signals. Lin et al. [7] performed channel estimation by applying Lagrange multipliers and a dual ascent-based scheme iteratively. They also found a closed-form solution for Cramer-Rao lower bounds and proposed a method that improves the accuracy of the classical least-square method. Taha et al. [8] presented an energy-efficient architecture where all the LIS's elements are passive except for a few distributed active elements that are arranged in a non-uniform manner. The reflector array applies deep learning models to obtain the optimal matrices of phase shifts.
Although an LIS is usually a panel of reflectors physically organized in planar shapes, Hu et al. [9] proposed alternative structures with a three-dimensional spatial configuration with spherical surfaces. In addition to broader coverage, they have a more straightforward positioning system when compared to the conventional planar arrays.
LIS must be large in far-field communications to compete with classic massive MIMO systems and compensate for multipath propagation and electromagnetic interference. Besides that, optimizing the phase shifts associated with each element of the LIS is a great challenge. Therefore, in [10], Najafi et al. proposed an optimization method based on the physical modeling of the propagation and clusterization of a thousand reflectors into small subsets, also known as tiles. Based on concepts from radar communications, they modeled the impact of each tile on the overall channel, calculated the associated electric and magnetic fields, and showed that it is possible to optimize the operation of the LIS to maximize some quality of service (QoS) criteria.
On the other hand, Garcia et al. [11] focused on near-field environments and established a relation between the array size and the Fresnel zones. The point approximation of the scattering characterization depends on the second- and third-order moments of the distance, whereas in the far field the dependence is on the fourth power. Kishk et al. [12] employed stochastic geometry tools to analyze the effect of the large-scale deployment of LIS on the performance of cellular networks in the presence of blockages. They established a relation between the density of LIS panels and the density of blockages.
Mukherjee [13] explores the idea of integrating LIS with mobile edge computing (MEC) technology that intends to leave computing involved in processing the received signal to a cloud server and describes how these technologies can mutually benefit and create a framework competitive for 6G. Finally, Malandrino et al. [14] analyze the possible benefits of using intelligent and reflective surfaces to increase the privacy and security of mobile communications through secrecy rate, considering that passive eavesdroppers are involved in the system, in addition to legitimate users.
In addition to the works related to optimal estimation and power control in transmission systems aided by LIS, it has become a trend to compute the channel's capacity in the face of eavesdroppers. The question to answer is: "Does such a system offer the physical layer security that prevents an intruder from receiving a signal not intended for him?" The secrecy outage probability metric can answer this question since it means the probability that the instantaneous secrecy capacity is less than or equal to a given capacity threshold. Below are some works that, like ours, are concerned with information security in systems assisted by LIS.

Related Works
For the case of Gaussian distributed channels, and considering parameters such as the distances between devices and the number of LIS elements, Yang et al. [15] derived a closed-form expression for the secrecy outage probability (SOP), assuming that the LIS uses CSI to implement the phase shifting perfectly. In turn, Trigui et al. [16] assumed a more realistic model in which there are errors caused by phase quantization. By leveraging Fox's H transforms, they obtained an exact SOP expression under the assumption of a large number of reconfigurable LIS elements and Rayleigh-distributed channels.
On the other hand, Ai et al. [17] demonstrated the potential of improving secrecy with LIS aid under different scenarios where a passive eavesdropper is attempting to retrieve the transmitted information: a vehicular-to-vehicular and a vehicular-to-infrastructure. Makarfi et al. [18] showed how the source power, eavesdropper distance, the number of LIS elements, the source-to-relay distance, and the secrecy threshold affect the secrecy capacity and SOP when the vehicular source uses an LIS as an access point.
Following the perspective of physical layer security, this paper analyzes the secrecy outage probability of an LIS-assisted system in which K antennas at the BS transmit simultaneously to a single user. As shown in [19], the overall fading coefficient is approximately Gamma distributed, even for small values of N and K, but that result covers only Nakagami-m fading channels with m = 1 (Rayleigh distribution). The reasoning is extended here to more general scenarios in which m assumes arbitrary values, so that a line-of-sight path may or may not exist in each of the intermediate channels (i.e., the paths between the transmitter and the LIS, and between the LIS and the user). To the best of the authors' knowledge, this is the first analysis covering channels both with and without a line of sight. The derived closed-form expressions for the bit error probability (BER) and the SOP make it possible to evaluate and design the system without running extensive Monte Carlo simulations, which would be computationally costly in a scenario with multiple antennas and multiple reflectors.
In contrast to our previous work [19], this study focuses on secrecy analysis and extends the system model to near-field scenarios. The presence of an unwanted eavesdropper link is a realistic consideration, since information leakage is increasingly worrisome, especially for banking, corporate, and government communications. This study also demonstrates the validity of the proposed bit error probability approach in environments with Nakagami-m fading.
The paper is organized as follows: Section 2 presents the system model and the initial equations that underlie the formulation of the problem, while Section 3 presents the closed-form expressions for the spectral efficiency, the BER, its upper bound, and the SOP. Section 4 demonstrates the validity of the proposed analytical expressions through Monte Carlo simulations, and Section 5 presents the final considerations. Demonstrations and mathematical deductions are presented in Appendix A.

System Model
As shown in Figure 1, this study considers a base station (BS) equipped with an array of K antennas transmitting the same signal to a single-antenna user, the destination. Additionally, a large intelligent surface with N reflecting elements aids the system. Both the BS-to-LIS and the LIS-to-user channels are modeled by the Nakagami-m distribution. There is a direct link between the user and the BS, and another between an eavesdropper and the BS, whose channel is also Nakagami-m distributed.
In the expression for the signal that arrives at the destination antenna, $\mathbf{h}_{SL} \in \mathbb{C}^{N \times 1}$ is the link between the source and the LIS, $\mathbf{h}_{LD} \in \mathbb{C}^{K \times N}$ is the link between the LIS and the destination, and $\mathbf{h}_{SD} \in \mathbb{C}^{K \times 1}$ is the direct link between the source and the destination. The term $\boldsymbol{\Phi} \in \mathbb{C}^{N \times N}$ is a diagonal matrix whose elements are the phase shifts $e^{-j\phi_1}, \ldots, e^{-j\phi_N}$ applied by the LIS to the incident electromagnetic waves. The LIS phases, $\phi_n\ \forall n$, are assumed continuous in the interval from 0 to $2\pi$ radians. The term $\boldsymbol{\Psi} = \mathbf{v}s$ represents the precoded signal, where the data symbol is $s \sim \mathcal{CN}(0, 1)$ and the optimal precoding vector $\mathbf{v}$ is applied by the BS according to the maximum ratio transmission (MRT) criterion, i.e., ... Finally, the term $\eta \sim \mathcal{CN}(0, 1)$ is additive white Gaussian noise (AWGN) with zero mean and unit variance.
Suppose that there is no LoS in the direct link and that it is modeled as a complex normal random variable with zero mean and variance $\sigma^2_{SD}$. Additionally, the magnitudes of the channels $h_i = |h_i| e^{j\varphi_i}$, with $i \in \{SL, LD\}$, are Nakagami-m distributed with probability density function (PDF) given by
$$f_{|h_i|}(x) = \frac{2 m_i^{m_i}}{\Gamma(m_i)\,\Omega_i^{m_i}}\, x^{2 m_i - 1} \exp\!\left(-\frac{m_i}{\Omega_i} x^2\right), \quad x \ge 0.$$
In this work, the parameters $m_i$ and $\Omega_i$ refer to the shape and spread of the Nakagami-m PDF, respectively. The distribution of the channel phases $\varphi_i$ is not specified since, for this model, these phases are not relevant.
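As an illustration of the fading model above, the following minimal sketch draws Nakagami-m magnitudes using the standard fact that the square of a Nakagami-m variable with shape m and spread Ω is Gamma distributed with shape m and scale Ω/m. The function names `nakagami_sample` and `nakagami_mean`, the seed, and the sample size are hypothetical choices made for this example only.

```python
import math
import random

def nakagami_sample(m, omega, rng):
    """Draw one Nakagami-m magnitude as sqrt of a Gamma(shape=m, scale=omega/m) variate."""
    return math.sqrt(rng.gammavariate(m, omega / m))

def nakagami_mean(m, omega):
    """Closed-form mean: Gamma(m + 1/2) / Gamma(m) * sqrt(omega / m)."""
    return math.exp(math.lgamma(m + 0.5) - math.lgamma(m)) * math.sqrt(omega / m)

# Illustrative parameters (m = 2, Omega = 1); the seed is arbitrary.
rng = random.Random(0)
samples = [nakagami_sample(2.0, 1.0, rng) for _ in range(200_000)]
emp_mean = sum(samples) / len(samples)                    # estimates E[|h|]
emp_power = sum(x * x for x in samples) / len(samples)    # estimates E[|h|^2] = Omega
```

With these settings the empirical mean should approach `nakagami_mean(2.0, 1.0)` and the empirical second moment should approach the spread Ω, which is a quick sanity check before the magnitudes are used in a larger simulation.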
Then, the overall channel, including the LIS and the antenna array, can be defined as
$$\mathbf{h} = \mathbf{h}_{LD}\,\boldsymbol{\Phi}\,\mathbf{h}_{SL} + \mathbf{h}_{SD},$$
whose representation in scalar form is
$$h_k = h^{SD}_k + \sum_{i=1}^{N} h^{LD}_{ki}\, e^{-j\phi_i}\, h^{SL}_i, \quad k = 1, \ldots, K.$$
Perfect phase cancelling occurs when each phase shift $\phi_i$ exactly compensates the phase of the cascaded coefficient $h^{LD}_{ki} h^{SL}_i$. However, removing the overall channel phase completely is unfeasible, and some residual phase noise $\theta_{ki}$ is left behind, which, in this work, is modeled as a Von Mises random variable with concentration parameter $\kappa$. Therefore, the overall channel can be written as
$$h_k = h^{SD}_k + \sum_{i=1}^{N} |h^{LD}_{ki}|\,|h^{SL}_i|\, e^{j\theta_{ki}}.$$
Although a zero phase error is unattainable, it is possible to estimate a phase adjustment matrix that performs as well as possible, so the phase errors are expected to be zero on average. The zero-mean Von Mises circular distribution is suitable for modeling the phases of each antenna's fading coefficients [20]. It has support on the interval $[-\pi, \pi]$ and a concentration parameter $\kappa$ associated with the quality of the phase adjustment performed by the LIS and the efficiency of the channel estimation method.
The moment-generating function (MGF) of the Von Mises distribution is useful since a complex exponential represents the phase adjustments. With the MGF, it is possible to calculate the statistical moments associated with the channel coefficients.
Let X be a zero-mean Von Mises random variable; its MGF is given by $\phi_p = E[e^{jpX}] = \alpha_p + j\beta_p$. Since the zero-mean Von Mises distribution is symmetric about zero, the imaginary part of the MGF is $\beta_p = E[\sin pX] = 0$, and the real part is $\alpha_p = E[\cos pX] = I_p(\kappa)/I_0(\kappa)$, where $I_p(\cdot)$ is the modified Bessel function of the first kind and order p.
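The moments of the phase error can be checked numerically. The sketch below is only an illustration: it evaluates the modified Bessel function through its power series (a choice made here to stay within the Python standard library), computes the ratio I_p(κ)/I_0(κ), and compares it with a sample average of cos θ drawn with `random.vonmisesvariate`. The helper names `bessel_i` and `von_mises_moment` are hypothetical.

```python
import math
import random

def bessel_i(p, x, terms=40):
    """Modified Bessel function of the first kind, integer order p >= 0, via its power series."""
    return sum((x / 2.0) ** (2 * k + p) / (math.factorial(k) * math.factorial(k + p))
               for k in range(terms))

def von_mises_moment(p, kappa):
    """alpha_p = E[cos(p X)] = I_p(kappa) / I_0(kappa) for a zero-mean Von Mises X."""
    return bessel_i(p, kappa) / bessel_i(0, kappa)

rng = random.Random(1)
kappa = 2.0
# vonmisesvariate returns angles in [0, 2*pi); cos and sin are periodic, so no wrapping is needed.
draws = [rng.vonmisesvariate(0.0, kappa) for _ in range(200_000)]
emp_alpha1 = sum(math.cos(t) for t in draws) / len(draws)   # estimates alpha_1
emp_beta1 = sum(math.sin(t) for t in draws) / len(draws)    # estimates beta_1, expected near 0
```

For κ = 0 the ratio is zero for every p ≥ 1, which corresponds to uniformly distributed phase errors; larger κ pushes α_1 toward one.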
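Considering that the precoder is the normalized Hermitian of the overall channel, the SNR of the desired link is denoted by $\gamma_D$. Assuming, as an approximation, that $\gamma_D$ is Gamma distributed, its parameters $\alpha$ and $\beta$ can be estimated by moment matching as
$$\alpha = \frac{E[\gamma_D]^2}{\mathrm{var}(\gamma_D)}, \qquad \beta = \frac{E[\gamma_D]}{\mathrm{var}(\gamma_D)},$$
where $\alpha$ and $\beta$ are the shape and rate parameters, while $E[\gamma_D]$ and $\mathrm{var}(\gamma_D)$ are the expected value and variance of $\gamma_D$, respectively, as shown throughout Appendix A.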
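The relation between the first two moments and the shape and rate parameters can be illustrated with a short sketch. It uses the standard moment-matching identities for a Gamma distribution (shape = mean²/variance, rate = mean/variance) and recovers the parameters of synthetic Gamma samples; the function name `gamma_moment_match` and the test parameters are hypothetical and not taken from the paper.

```python
import random

def gamma_moment_match(mean, variance):
    """Return (shape, rate) of the Gamma distribution with the given mean and variance."""
    shape = mean * mean / variance
    rate = mean / variance
    return shape, rate

# Synthetic check: draw Gamma samples with known parameters and recover them.
rng = random.Random(5)
true_shape, true_rate = 6.0, 1.5
samples = [rng.gammavariate(true_shape, 1.0 / true_rate) for _ in range(200_000)]
sample_mean = sum(samples) / len(samples)
sample_var = sum((g - sample_mean) ** 2 for g in samples) / (len(samples) - 1)
est_shape, est_rate = gamma_moment_match(sample_mean, sample_var)
```

In the setting of this study, the mean and variance would come from the analytical expressions of Appendix A rather than from samples, so no simulation is needed to obtain α and β.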
The assumption that $\gamma_D$ is Gamma distributed can be assessed using the Hellinger distance. According to Beran [21], the Hellinger distance between two arbitrary discrete probability distributions $p_k$ and $q_k$ can be obtained as
$$D_{HL} = \frac{1}{\sqrt{2}} \sqrt{\sum_{k=1}^{N_p} \left(\sqrt{p_k} - \sqrt{q_k}\right)^2},$$
where $N_p$ is the number of samples available to calculate the distance. The Hellinger distance is limited to the interval $0 \le D_{HL} \le 1$ and can be considered an absolute metric. In Figure 2, the realizations of a Monte Carlo simulation of the channels involved in the system are used to compose a histogram that approximates the PDF of the overall channel, which is compared to the Gamma distribution predicted by the approximation proposed in this study. The analysis used $10^6$ iterations with unit variance for all channels, Von Mises concentration parameter $\kappa = 2$, and Nakagami parameter $m = 2$. The results show that the Hellinger distance decreases when N and K increase. In the last case, for K = 16, the decrease is even more pronounced. This accurate approximation motivates the problem formulation that follows.
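A minimal sketch of the distance computation is given below, assuming the standard discrete definition of the Hellinger distance (the Euclidean distance between the square roots of the two probability vectors, divided by √2). The function name `hellinger` and the toy distributions are hypothetical.

```python
import math

def hellinger(p, q):
    """Hellinger distance between two discrete distributions given as equal-length sequences."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of bins")
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared) / math.sqrt(2.0)

# Toy examples: identical, disjoint, and slightly different distributions.
d_same = hellinger([0.25, 0.25, 0.5], [0.25, 0.25, 0.5])
d_disjoint = hellinger([1.0, 0.0], [0.0, 1.0])
d_close = hellinger([0.5, 0.5], [0.4, 0.6])
```

To compare a simulated histogram with a fitted Gamma distribution, `p` would hold the normalized bin counts and `q` the probability mass that the fitted distribution assigns to the same bins.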

Problem Formulation
Knowing that the SNR can be approximated by a Gamma random variable, closed-form expressions for spectral efficiency, BER and SOP are derived in the following subsections.

Spectral Efficiency
The average spectral efficiency of the system is defined as the expectation of the instantaneous spectral efficiency over the distribution of $\gamma_D$. Its approximated solution is expressed in terms of the digamma function $\psi^{(0)}(\cdot)$, the gamma function $\Gamma(\cdot)$, the incomplete gamma function $\Gamma(\cdot, \cdot)$, and the generalized hypergeometric function ${}_2F_2(a, b; c, d; e)$. It is noteworthy that, although this study did not find an explicit solution for the spectral efficiency, it does present a more generic solution for the integral than the one in reference [22].
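As a numerical cross-check of an average spectral efficiency under a Gamma distributed SNR, the sketch below estimates E[log2(1 + γ)] by Monte Carlo and compares it with the Jensen upper bound log2(1 + E[γ]). The base-2 logarithm, the function names, and the parameter values are assumptions made for this example; the paper's closed-form expression is not reproduced here.

```python
import math
import random

def spectral_efficiency_mc(shape, rate, n, rng):
    """Monte Carlo estimate of E[log2(1 + gamma)] for gamma ~ Gamma(shape, rate)."""
    total = 0.0
    for _ in range(n):
        total += math.log2(1.0 + rng.gammavariate(shape, 1.0 / rate))
    return total / n

def jensen_upper_bound(shape, rate):
    """Jensen's inequality for the concave log: E[log2(1 + gamma)] <= log2(1 + E[gamma])."""
    return math.log2(1.0 + shape / rate)

rng = random.Random(11)
se_estimate = spectral_efficiency_mc(8.0, 2.0, 100_000, rng)  # mean SNR = 8 / 2 = 4
se_bound = jensen_upper_bound(8.0, 2.0)                        # log2(5)
```

The gap between the estimate and the bound shrinks as the shape parameter grows, because the SNR then concentrates around its mean.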

Bit Error Probability
The error probability for M-QAM modulation, $P_e^{QAM}$, can be approximately obtained as in [23]. Assuming, as an approximation, that $\gamma$ is Gamma distributed, the mean bit error probability $\bar{P}_e^{QAM}$ can be calculated by averaging $P_e^{QAM}$ over the distribution of the fading, where $f_{w^2}(v)$ is the PDF of $w^2$ as a function of an independent variable v.
Ferreira et al. [19] derived a tight upper bound for the mean error probability of an M-QAM scheme under Gamma fading. From the Chernoff bound $Q(x) \le \frac{1}{2} e^{-\frac{1}{2}x^2}$, they obtained the upper bound (15) for the BER, which is close to the exact solution.
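The mechanism behind a Chernoff-type bound under Gamma fading can be sketched as follows. The example assumes a generic nearest-neighbour approximation of the form a·Q(√(b·γ)) for square M-QAM, with a = 4(1 − 1/√M)/log2 M and b = 3/(M − 1); these constants are an assumption of this sketch and are not claimed to be the expression of [23] or the bound of [19]. Applying Q(x) ≤ ½ e^{−x²/2} and the moment-generating function of the Gamma distribution, E[e^{−cγ}] = (β/(β + c))^α, gives a closed-form bound that is compared with a Monte Carlo average.

```python
import math
import random

def q_function(x):
    """Gaussian tail probability Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def qam_constants(m_order):
    """Assumed constants (a, b) in the approximation P_b ~ a * Q(sqrt(b * gamma))."""
    a = 4.0 * (1.0 - 1.0 / math.sqrt(m_order)) / math.log2(m_order)
    b = 3.0 / (m_order - 1.0)
    return a, b

def chernoff_ber_bound(shape, rate, m_order):
    """(a / 2) * E[exp(-b * gamma / 2)] for gamma ~ Gamma(shape, rate), using the Gamma MGF."""
    a, b = qam_constants(m_order)
    return 0.5 * a * (rate / (rate + 0.5 * b)) ** shape

def ber_monte_carlo(shape, rate, m_order, n, rng):
    """Sample average of a * Q(sqrt(b * gamma)) over Gamma distributed SNR draws."""
    a, b = qam_constants(m_order)
    total = 0.0
    for _ in range(n):
        g = rng.gammavariate(shape, 1.0 / rate)
        total += a * q_function(math.sqrt(b * g))
    return total / n

rng = random.Random(13)
ber_mc = ber_monte_carlo(8.0, 2.0, 16, 100_000, rng)
ber_bound = chernoff_ber_bound(8.0, 2.0, 16)
```

Because the bound only requires the shape and rate of the fitted Gamma distribution, it can be evaluated for any N and K without simulating the individual channels.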

Secrecy Outage Probability
Considering that an eavesdropper has access to the signal provided by the source and according to [24], the secrecy capacity associated with the two fading channels can be obtained as
$$C = \left[\ln(1 + \gamma_D) - \ln(1 + \gamma_E)\right]^{+},$$
where $\gamma_E$ is the SNR of the link between the source and the eavesdropper and $[x]^{+} = \max(x, 0)$. Therefore, the SOP is defined as the probability that the instantaneous secrecy capacity, C, is less than or equal to a given capacity threshold, $\ln(1 + \gamma_{th})$, which is expressed as
$$\mathrm{SOP} = \Pr\left[C \le \ln(1 + \gamma_{th})\right], \quad (17)$$
where $\Pr[\cdot]$ denotes the probability of a random event.
Considering a Nakagami-m distributed eavesdropper channel, the SOP can be obtained as follows:
$$\mathrm{SOP} = \int_0^{\infty} f_{\gamma_E}(x) \int_0^{x + \gamma_{th}(1 + x)} f_{\gamma_D}(y)\, dy\, dx.$$
Solving the first integral, the remaining expression contains the term $\Gamma(\alpha) - \Gamma(\alpha, \beta(x + \gamma_{th}(1 + x)))$, which can be rewritten as a function of the lower incomplete gamma function, considering that $\Gamma(s) = \gamma(s, x) + \Gamma(s, x)$.
Representing the exponential in terms of a power series, it follows that the incomplete Gamma function can be written as the series expansion (21). Applying the expansion (21) in (19), and using the result given in the table of integrals of reference [22] for integrands of type $x^{a} e^{-p x^{2}} \gamma(\nu, c x)$, the SOP can be rewritten as (22), where ${}_p\tilde{F}_q$ is the regularized ${}_pF_q$ hypergeometric function and $v = \frac{1 + \gamma_{th}}{\gamma_{th}}$. Although this expression is an infinite sum, Section 4 shows that the error is small if only the first term is used to compute the SOP.
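The SOP integral can also be evaluated numerically, which gives a reference for any truncated closed form. The sketch below is an illustration under stated assumptions: the legitimate SNR is Gamma(α, β), the eavesdropper SNR is taken as Gamma with shape m_E and mean Ω_E (unit transmit power and unit noise), and the outage event is γ_D ≤ γ_E + γ_th(1 + γ_E). The regularized lower incomplete gamma function is computed with a standard series, the integral with the trapezoidal rule, and the result is compared with a Monte Carlo estimate. All function names and parameter values are hypothetical.

```python
import math
import random

def reg_lower_gamma(a, x, terms=1000):
    """Regularized lower incomplete gamma P(a, x) via the series e^-x x^a sum x^n / (a ... (a+n))."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    for n in range(1, terms):
        term *= x / (a + n)
        total += term
        if term < 1e-16 * total:
            break
    return min(1.0, total * math.exp(a * math.log(x) - x - math.lgamma(a)))

def gamma_pdf(x, shape, scale):
    """PDF of a Gamma(shape, scale) variable, evaluated in the log domain for stability."""
    if x <= 0.0:
        return 0.0
    return math.exp((shape - 1.0) * math.log(x) - x / scale
                    - math.lgamma(shape) - shape * math.log(scale))

def sop_integral(alpha, beta, m_e, omega_e, gamma_th, x_max=20.0, steps=2000):
    """Trapezoidal rule for SOP = int_0^inf F_D(x + gamma_th (1 + x)) f_E(x) dx, truncated at x_max."""
    h = x_max / steps
    total = 0.0
    for i in range(steps + 1):
        x = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        cdf_d = reg_lower_gamma(alpha, beta * (x + gamma_th * (1.0 + x)))
        total += weight * cdf_d * gamma_pdf(x, m_e, omega_e / m_e)
    return total * h

def sop_monte_carlo(alpha, beta, m_e, omega_e, gamma_th, n, rng):
    """Fraction of draws in which gamma_D <= gamma_E + gamma_th * (1 + gamma_E)."""
    hits = 0
    for _ in range(n):
        g_d = rng.gammavariate(alpha, 1.0 / beta)
        g_e = rng.gammavariate(m_e, omega_e / m_e)
        if g_d <= g_e + gamma_th * (1.0 + g_e):
            hits += 1
    return hits / n

rng = random.Random(17)
sop_num = sop_integral(8.0, 2.0, 1.4, 1.0, 1.0)
sop_mc = sop_monte_carlo(8.0, 2.0, 1.4, 1.0, 1.0, 100_000, rng)
```

The truncation point `x_max` and the number of steps are chosen for the illustrative parameters only; a heavier-tailed eavesdropper channel would need a larger integration range.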

Numerical Results
This section analyzes the accuracy of the proposed approximations and discusses the improvements in capacity provided by LIS. Unless otherwise specified, this study adopts the Nakagami-m shape parameters $m_{SL} = m_{LD} = m = 2$, spread parameters $\Omega_{SL} = \Omega_{LD} = \Omega$ chosen to make the variances $\sigma^2_{SL} = \sigma^2_{LD} = 1$, Von Mises concentration parameter $\kappa = 2$, K = 16 antennas at the source, an M-QAM constellation of size M = 16, and $10^6$ iterations for each Monte Carlo simulation.
For each iteration of the Monte Carlo method, the coefficients $h_k$ of (6) are generated, with magnitudes drawn from the Nakagami-m distribution and phase errors drawn from the Von Mises distribution. Given the coefficients, the bit error rate, spectral efficiency, and SOP are computed in each realization of the random variables, and the simulated results are approximated by the mean value of these quantities. Each simulated result is then compared with the theoretical formulas described in terms of the channel parameters.
Figure 3 shows the simulated and theoretical BERs considering Von Mises and uniformly distributed phase errors. The theoretical BER is obtained assuming that the overall fading channel has a Gamma distribution. Note that the larger the number of reflectors, N, the smaller the error probability for any SNR value. When the phase errors are uniformly distributed ($\kappa = 0$), the error probability is higher than in the Von Mises scenario. This result shows the importance of accurately estimating the phases and channel gains and of choosing the optimization method used to find the best LIS phase shifts. Uniformly distributed phase noise indicates that the algorithm is equally likely to produce significant phase errors (close to $\pm\pi$) or small phase errors (close to zero). That implies greater bit error probabilities, which can be compensated only with a large number of antennas at the transmitter or a large number of reflectors at the LIS.
In turn, Figure 4 confirms that large reflecting surfaces can produce an LoS link between the transmitter and the user even in a far-field Rayleigh fading channel. However, in a near-field scenario, a stronger LoS link (higher Nakagami-m parameter) implies a lower probability of error. Even in weak LoS scenarios, an LIS can decrease the probability of bit error by creating an LoS path that results from beamforming toward the target user.
Figure 3 shows that, for an environment with a fixed value of m, increasing the number of reflectors (N) or improving the phase adjustment performed by the LIS (related to the concentration parameter $\kappa$) can reduce the bit error rate in an LIS-aided system.
The upper bound (15) for the error probability proposed by Ferreira et al. [19] is very close to the bit error rate, as shown in Figure 5, even when the fading coefficients are Nakagami-m distributed.
Regarding spectral efficiency, this study considers two scenarios. The first one has uniformly distributed phase noise, and the second one has Von Mises distributed phase noise, as shown in Figure 6. Notably, the spectral efficiency increases when the LIS has a larger number of reflectors, thus indicating a better sharing of the spectrum for the transmission of signals in a multiuser scenario. Moreover, the efficiency is higher when the phase errors have a Von Mises distribution and lower when they are uniformly distributed, which means that the phase adjustment of the LIS is a highly relevant factor in improving the spectral efficiency, reinforcing the importance of channel estimation and of the choice of the phase correction applied to the reflectors. When the phase error distribution is more concentrated around zero (higher $\kappa$ values), the spectral efficiency is higher. Using an antenna array at the base station can be a good choice to achieve better spectrum sharing in diverse scenarios. It is also remarkable that the result predicted by the formula proposed for the spectral efficiency is very close to the results obtained by Monte Carlo simulation.
Figure 7 shows the secrecy outage probability for a Nakagami-m eavesdropper link with $\Omega = 1$ and $m = 1.4$. The sum was truncated at index 1000, and $10^6$ iterations were used to generate the Gamma distributed random SNR with parameters $\alpha$ and $\beta$, with Von Mises concentration parameter $\kappa = 2$, K = 2 antennas, unit variance, and Nakagami-m fading for all channels between the antennas, the LIS, and the user. The larger the number of reflectors or the SNR, the greater the SOP.
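The Monte Carlo procedure described in this section can be sketched in a few lines. The example below is a simplified illustration, not the simulation code of this study: it assumes that each coefficient is the direct-link term plus a sum of products of Nakagami-m magnitudes rotated by Von Mises phase errors, that the source-to-LIS magnitudes are shared by all K antennas, and that the SNR is the sum of |h_k|² with unit transmit power and unit noise. The function name `simulate_snr`, the small values N = 8 and K = 2, and the sample count are hypothetical and chosen to keep the run short.

```python
import math
import random

def simulate_snr(n_refl, k_ant, m, omega, kappa, sigma_sd, rng):
    """One realization of sum_k |h_k|^2 (unit transmit power and unit noise assumed)."""
    scale = omega / m
    # Source-to-LIS magnitudes, shared by all K antennas in this sketch.
    h_sl = [math.sqrt(rng.gammavariate(m, scale)) for _ in range(n_refl)]
    snr = 0.0
    for _ in range(k_ant):
        # Direct link without LoS: circular complex Gaussian with variance sigma_sd^2.
        re = rng.gauss(0.0, sigma_sd / math.sqrt(2.0))
        im = rng.gauss(0.0, sigma_sd / math.sqrt(2.0))
        for i in range(n_refl):
            mag = math.sqrt(rng.gammavariate(m, scale)) * h_sl[i]
            theta = rng.vonmisesvariate(0.0, kappa)   # residual phase error
            re += mag * math.cos(theta)
            im += mag * math.sin(theta)
        snr += re * re + im * im
    return snr

rng = random.Random(7)
snrs = [simulate_snr(8, 2, 2.0, 1.0, 2.0, 1.0, rng) for _ in range(10_000)]
mean = sum(snrs) / len(snrs)
var = sum((g - mean) ** 2 for g in snrs) / (len(snrs) - 1)
# Gamma fit by moment matching: shape = mean^2 / var, rate = mean / var.
shape, rate = mean * mean / var, mean / var
```

The fitted `shape` and `rate` can then be plugged into closed-form or numerical expressions for the BER, spectral efficiency, and SOP, and compared with averages computed directly from `snrs`.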
The first-order approximation of the SOP, considering that the Nakagami-m parameters are m = 2.5 and Ω = 0.1 for all the channels in the system model, is also close to the simulated result as shown in Figure 8.

Final Considerations
This work has presented an in-depth analysis of the performance of systems aided by large intelligent surfaces, considering the existence of an eavesdropper link in generic scenarios that contemplate channels with and without LoS links, modeled by the Nakagami-m distribution, and with or without a direct link between the transmitter and the user. This study derives very accurate analytical expressions for computing the secrecy outage probability, bit error probability, and secrecy capacity, in addition to reasonable approximations for estimating the equivalent channel parameters based on the central limit theorem.
Author Contributions: All authors contributed equally to this study working in the bibliographic review, calculating the mathematical expressions, executing the simulations, and analysis. All authors have read and agreed to the published version of the manuscript.

Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.

Acknowledgments:
We are thankful for the financial support of the Eldorado Research Institute that paid the charges involved in the publication of this study.

Conflicts of Interest:
The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

Appendix A. Parameters of γ D
It is possible to obtain the parameters of the γ D distribution, following the steps below.

Appendix A.1. Expected Value of Each Fading Coefficient
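Since the expected value is a linear operator, then
$$E[h_k] = E[h^{SD}_k] + \sum_{i=1}^{N} E\left[|h^{LD}_{ki}|\,|h^{SL}_i|\, e^{j\theta_{ki}}\right].$$
Since the channels are independent and identically distributed, $E[h^{SD}_k] = 0$, and $E[e^{j\theta_{ki}}] = \alpha_1$, thus
$$E[h_k] = N \alpha_1 \mu_{LD}\, \mu_{SL},$$
where
$$\mu_{i} = \frac{\Gamma(m_i + 1/2)}{\Gamma(m_i)} \sqrt{\frac{\Omega_i}{m_i}}, \quad i \in \{SL, LD\},$$
are the expected values of each Nakagami-m channel.
To obtain the variance of the overall channel fading coefficient, the means of $c_k = \mathrm{Re}\{h_k\}$ and $s_k = \mathrm{Im}\{h_k\}$ need to be calculated, i.e., those of the in-phase and quadrature components of the fading coefficient, respectively. The in-phase component can be written as
$$c_k = \mathrm{Re}\{h^{SD}_k\} + \sum_{i=1}^{N} |h^{LD}_{ki}|\,|h^{SL}_i| \cos\theta_{ki},$$
while the quadrature component is
$$s_k = \mathrm{Im}\{h^{SD}_k\} + \sum_{i=1}^{N} |h^{LD}_{ki}|\,|h^{SL}_i| \sin\theta_{ki}.$$
Then, the expected value of $c_k$ is
$$E[c_k] = N \alpha_1 \mu_{LD}\, \mu_{SL},$$
since $E[\mathrm{Re}\{h^{SD}_k\}] = 0$ and all the summation terms are independent; this coincides with $E[h_k]$. In turn, the expected value of $s_k$ is given by
$$E[s_k] = 0,$$
since $E[\mathrm{Im}\{h^{SD}_k\}] = 0$, all the summation terms are independent, and $E[\sin\theta_{ki}] = 0$.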

Appendix A.2. Variance of the In-Phase and Quadrature Components of Each Fading Coefficient
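The variance of the in-phase component is written as
$$\mathrm{var}(c_k) = \mathrm{var}\!\left(\mathrm{Re}\{h^{SD}_k\} + \sum_{i=1}^{N} |h^{LD}_{ki}|\,|h^{SL}_i| \cos\theta_{ki}\right).$$
Since $\mathrm{var}(h^{SD}_k) = \sigma^2_{SD}$, the summation terms and $\mathrm{Re}\{h^{SD}_k\}$ are independent, and the $h^{SD}_k$ coefficient is zero mean, then
$$\mathrm{var}(c_k) = \frac{\sigma^2_{SD}}{2} + \sum_{i=1}^{N} \mathrm{var}\!\left(|h^{LD}_{ki}|\,|h^{SL}_i| \cos\theta_{ki}\right).$$
Next, the variance of the term $|h^{LD}_{ki}|\,|h^{SL}_i| \cos\theta_{ki}$ is needed, considering that the variance of the product of two independent random variables X and Y is $\mathrm{var}(XY) = \mathrm{var}(X)\mathrm{var}(Y) + \mathrm{var}(X)E[Y]^2 + \mathrm{var}(Y)E[X]^2$ and that the phase noise is independent of the fading magnitudes. Thus,
$$\mathrm{var}\!\left(|h^{LD}_{ki}|\,|h^{SL}_i| \cos\theta_{ki}\right) = \mathrm{var}\!\left(|h^{LD}_{ki}|\,|h^{SL}_i|\right)\mathrm{var}(\cos\theta_{ki}) + \mathrm{var}\!\left(|h^{LD}_{ki}|\,|h^{SL}_i|\right)E[\cos\theta_{ki}]^2 + \mathrm{var}(\cos\theta_{ki})\,E\!\left[|h^{LD}_{ki}|\,|h^{SL}_i|\right]^2.$$
Since $|h^{LD}_{ki}|$ and $|h^{SL}_i|$ are independent,
$$\mathrm{var}\!\left(|h^{LD}_{ki}|\,|h^{SL}_i|\right) = \sigma^2_{LD}\sigma^2_{SL} + \sigma^2_{LD}\mu^2_{SL} + \sigma^2_{SL}\mu^2_{LD}.$$
By using (A16), (A13) can thus be evaluated and rewritten in terms of the moments of the channels and of the phase noise, which gives the variance of the in-phase component.
On the other hand, the variance of the quadrature component of each fading coefficient follows the same steps. Considering that $\mathrm{var}(h^{SD}_k) = \sigma^2_{SD}$, the summation terms and $\mathrm{Im}\{h^{SD}_k\}$ are independent, and the term $h^{SD}_k$ is zero mean, the variance of $s_k$ is the sum of the variance of $\mathrm{Im}\{h^{SD}_k\}$ and the variances of the terms $|h^{LD}_{ki}|\,|h^{SL}_i| \sin\theta_{ki}$, which can be rewritten to express the variance of the quadrature component.
All the fading coefficients are independent and identically distributed and, by using (A19) and (A25), the mean value of the overall fading coefficient magnitude is obtained. The variance of the sum of the terms $Z_i$, whose magnitudes are equally distributed, is then given by the sum of the individual variances and of the pairwise covariances. The covariance requires the expected value of the product of two different in-phase coefficients. Expanding the product, the independent terms can be separated, where $E[|h^{SL}_m|^2] = \sigma^2_{SL} + \mu^2_{SL}$ and the variance of the Nakagami-m distributed term is $\sigma^2_i = \Omega_i - \mu_i^2$.
To obtain the variance according to (A34), the term $E[c_i^2 c_k^2]$ needs to be computed. By the central limit theorem (CLT), for large values of N, the terms are approximately correlated Gaussian random variables; therefore,
$$E[c_i^2 c_k^2] = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} x^2 y^2 f_{c_i, c_k}(x, y)\, dx\, dy,$$
where $f_{c_i, c_k}(x, y)$ is the joint distribution of the two correlated Gaussian variables $c_i$ and $c_k$.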
The result of the integral depends on the correlation coefficient $\rho_{c_i, c_k}$, for which $E[c_i c_k]$ can be calculated by (A45).
Since the moments of $c_k$ were previously calculated, $\rho_{c_i, c_k}$ can be obtained by
$$\rho_{c_i, c_k} = \frac{E[c_i c_k] - E[c_i]E[c_k]}{\sqrt{\mathrm{var}(c_i)\,\mathrm{var}(c_k)}}.$$
In turn, applying (A46) for the correlation coefficient $\rho_{c_i, c_k}$ in the definition of $E[c_i^2 c_k^2]$ in (A43) gives (A47).
The term $E[c_i^2 s_k^2]$ depends on two random variables, $c_i$ and $s_k$, that are uncorrelated but not independent. Since their correlation is zero, calculating the correlation of the squared product requires expanding the definition of the two terms, as shown in (A48). Expanding the product in (A48) leads to the general and simplified result given in Ferreira et al. [19], which is composed of an in-phase term and a quadrature term. The term $E[s_i^2 s_k^2]$ can be easily obtained because $s_i$ and $s_k$ are uncorrelated and zero mean; substituting the variances gives (A60).
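The product-variance identity used in Appendix A.2, var(XY) = var(X)var(Y) + var(X)E[Y]² + var(Y)E[X]², holds for independent X and Y and can be checked with a short simulation. The sketch below uses Gaussian variables purely for convenience; the function name `var_product` and the chosen means and standard deviations are hypothetical.

```python
import random

def var_product(var_x, mean_x, var_y, mean_y):
    """Variance of the product of two independent random variables."""
    return var_x * var_y + var_x * mean_y ** 2 + var_y * mean_x ** 2

rng = random.Random(3)
n = 200_000
xs = [rng.gauss(1.0, 0.5) for _ in range(n)]   # mean 1.0, variance 0.25
ys = [rng.gauss(2.0, 0.3) for _ in range(n)]   # mean 2.0, variance 0.09
prods = [x * y for x, y in zip(xs, ys)]
mean_p = sum(prods) / n
emp_var = sum((p - mean_p) ** 2 for p in prods) / (n - 1)
predicted = var_product(0.25, 1.0, 0.09, 2.0)  # 0.0225 + 1.0 + 0.09 = 1.1125
```

The same identity is applied twice in the appendix: once to the product of the two fading magnitudes and once to the product of that result with the cosine or sine of the phase error.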