Intelligent Reflective Surfaces for Wireless Networks: An Overview of Applications, Approached Issues, and Open Problems

Abstract: An intelligent reflective surface (IRS) is a novel and revolutionizing communication technology destined to enable the control of the radio environment. An IRS is a real-time controllable reflectarray with a massive number of low-cost passive elements which introduce a phase shift to the incoming signals from the sources before the propagation towards the destination. This technology introduces the notion of a smart propagation environment with the aim of improving the system performance. In this paper, we provide a comprehensive literature overview on IRS technology, including its basic concepts and reconfiguration, as well as its design aspects and applications for wireless communication systems. We also study the performance metrics and the setups considered in recent publications related to IRS and provide suggestions for future research lines based on still unexplored use cases in the state-of-the-art.


Introduction
Conventional wireless communication systems consist of a transmitter sending information to a receiver via an uncontrollable propagation environment. Given the growing interest in accomplishing real-time reconfigurable propagation environments for beyond fifth-generation (B5G) technologies and future wireless communication systems, intelligent reflective surfaces (IRSs) constitute a promising candidate since they will make it possible to increase the number of served users and enhance the communication rate [1][2][3][4]. An IRS is a real-time reconfigurable reflectarray deployed to smartly reconfigure the wireless propagation environment through the use of massive low-cost passive elements. IRS technologies are widely labelled in the literature under other names like software-defined hypersurfaces, intelligent walls, software-controlled meta-surfaces, large intelligent surfaces/antennas, and reconfigurable intelligent surfaces.
The implementation of an IRS-assisted system is similar to that of half-duplex relays, with the key difference that an IRS implements passive beamforming [5][6][7], i.e., it reflects the signal without amplification and can only introduce phase shifts to the incoming signals from the base station (BS), so that the power consumption of the IRS is minimal. In this regard, IRS-assisted communication can be employed to enhance the performance of conventional wireless communication systems by enabling more degrees of freedom through the control of the wireless channel, thus leading to a more relaxed set of constraints. Moreover, meta-surfaces can control the radio environment with low-noise amplification and do not require either analogue/digital converters, or power amplifiers.

Contributions
In this paper, we present a study of the theoretical basis of the IRS, a comprehensive overview of its most recent applications and an up-to-date review of the papers related to the IRS technology. This analysis includes the performance metrics employed for the system design, their main results and contributions, as well as the different considered setups. We also describe the open problems and main challenges related to this technology. The main contributions of our work are:
• A systematic and current organization of the state-of-the-art of the IRS technology considering different backgrounds and interests after an extensive review of the literature.
• An analysis and categorization of the considered setups and techniques used for the design of the IRS systems in the existing literature.
• A study of the main applications for the IRS technology and the still open problems identified after a comprehensive and thorough review.

Organization
In this paper, we present an overview of the IRS technology. The main aspects related to its theory and design are detailed in Section 2. A review of the approached use cases in the state-of-the-art related to the use of IRSs for wireless networks is performed in Section 3. The main applications for IRS are summarized in Section 4. The practical challenges, the open issues, and the future research lines are analyzed in Section 5. Finally, Section 6 is devoted to the conclusions.

Notations
The following notation is employed: $a$ is a scalar, $\mathbf{a}$ is a vector, and $\mathbf{A}$ stands for a matrix. Transpose and conjugate transpose of $\mathbf{A}$ are represented by $\mathbf{A}^T$ and $\mathbf{A}^H$, respectively. $\mathcal{A}$ denotes a set. Finally, the statistical expectation is denoted by $\mathbb{E}[\cdot]$.

Theory and Design of IRSs
The hardware implementation of the IRSs is based on the concept of meta-surface, i.e., a controllable two-dimensional meta-material [12,17]. Specifically, the meta-surface consists of a planar array with a large number of meta-atoms with a sub-wavelength electrical thickness (as shown in Figure 1), according to the operating frequency. The meta-atoms are metals or dielectrics able to transform the impinging electromagnetic waves. The properties of these elements and the structural arrangement in the array determine the transformations on the incident waves. The specific physical structure of the meta-surface defines the electromagnetic properties and thus the purpose at the specific frequency [18,19]. The tunable chips inserted in the meta-surface to interact with the scattering element and communicate with an IRS controller are implemented through positive-intrinsic-negative (PIN) diodes [20,21], ferro-electric devices, or varactor diodes [22][23][24][25].
Generally, the approaches in the literature related to the design of meta-surfaces are based on Snell's law of reflection: the strongest scattered signal is obtained in the specular direction, i.e., with $\theta_i$ being the angle of the incident wave, the strongest reflected signal is obtained for an angular direction $\theta_s$ such that $\theta_s = \theta_i$ (as shown in Figure 2). There are a few papers that mainly focus on the theoretical basis, physics and classification of the meta-surfaces [26,27]. In [26], the authors discuss the theoretical basis to characterize meta-surfaces and comment on their application as well as how meta-surfaces are distinguished from conventional frequency-selective surfaces. In [27], the physical properties and future applications of the meta-surfaces are analyzed.
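As a point of reference, the specular condition can be read as the special case of the generalized law of reflection in which the surface applies no phase gradient. The expression below is a sketch under assumptions that are not stated in the text above: the surface imposes a position-dependent phase shift $\Phi(x)$ along one axis $x$, the surrounding medium is air, and $\lambda$ is the free-space wavelength.

```latex
\sin\theta_s - \sin\theta_i = \frac{\lambda}{2\pi}\,\frac{\mathrm{d}\Phi(x)}{\mathrm{d}x}
```

With $\mathrm{d}\Phi/\mathrm{d}x = 0$ this reduces to $\theta_s = \theta_i$, whereas a non-zero gradient, set through the tunable elements, moves the strongest reflection away from the specular direction.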

IRS's Controller
The IRS's controller is devoted to receiving and communicating the reconfiguration requests, and then distributing the phase shift decisions to all the tunable IRS elements. This controller might be implemented by employing a field-programmable gate array (FPGA) [20,28], a direct current (DC) source [29], or a microcontroller [23,30]. In [31], a meta-surface composed of reconfigurable meta-material strips arranged in a grid has been designed. Specifically, a set of four meta-material strips is configured via a controller switch. The work in [32] also considers a reduced number of controller chips to serve the full IRS array.

IRS-Assisted Wireless Networks
In this section, we briefly describe the signal model of an IRS-assisted system and then describe the use cases approached in the state-of-the-art. For the best comprehension and organization, we divide the scenarios into two sub-problems: the channel estimation and the optimization and design of IRS phase shifts.

Signal Model
In this subsection we introduce the main elements of the IRS-aided systems. Different setups can be considered, including single-IRS-assisted systems, schemes with non-line-of-sight (NLOS) conditions between both ends, wideband transmissions, etc. In the following, we present the signal model of two exemplary multiuser multiple-input multiple-output (MIMO) IRS-assisted schemes.

Signal Model of Uplink IRS-Aided Systems
Figure 3 shows an exemplary scenario corresponding to the uplink of a single-stream MIMO multiuser multi-IRS-assisted system with $K$ users with $N_t$ antennas each. $N_r$ antennas are deployed at the BS and $L$ IRSs with $N$ elements each are used to assist the communication. The symbol sent by the $k$-th user is represented by $s_k$, with $k = 1, \ldots, K$. As observed, $\mathbf{H}_{\mathrm{UI}k}^{(\ell)} \in \mathbb{C}^{N \times N_t}$ and $\mathbf{H}_{\mathrm{UB}k} \in \mathbb{C}^{N_r \times N_t}$ represent the channel matrix response from the $k$-th user to the $\ell$-th IRS and the channel matrix response from the $k$-th user to the BS, respectively. Each IRS is mathematically modelled as a diagonal matrix $\boldsymbol{\Theta} \in \mathbb{C}^{N \times N}$, which contains the complex exponentials $e^{j\theta_n}$, $\forall n \in \{1, \ldots, N\}$, in its diagonal, $N$ being the number of passive elements at the IRS. Accordingly, $\boldsymbol{\Theta}^{(\ell)}$, $\ell = 1, \ldots, L$, are the matrices containing the phase shifts introduced by the $\ell$-th IRS. $\mathbf{H}_{\mathrm{IB}}^{(\ell)} \in \mathbb{C}^{N_r \times N}$ is the channel matrix response from the $\ell$-th IRS to the BS. The channels from the users to the BS are assumed to be available, i.e., there is line-of-sight (LOS) between each user and the BS. The path from the $k$-th user to the BS through the IRS, i.e., $\mathbf{H}_{\mathrm{UI}k}^{(\ell)}$ and $\mathbf{H}_{\mathrm{IB}}^{(\ell)}$, composes the so-called cascaded channel or reflected path. According to this system model, the received signal at the BS is

$$\mathbf{y} = \sum_{k=1}^{K} \left( \sum_{\ell=1}^{L} \mathbf{H}_{\mathrm{IB}}^{(\ell)} \boldsymbol{\Theta}^{(\ell)} \mathbf{H}_{\mathrm{UI}k}^{(\ell)} + \mathbf{H}_{\mathrm{UB}k} \right) \mathbf{p}_k s_k + \mathbf{n}, \qquad (1)$$

where $\mathbf{n} = [n_1, n_2, \ldots, n_{N_r}]^T$ represents the complex-valued additive white Gaussian noise (AWGN), which is modeled as $\mathbf{n} \sim \mathcal{N}_{\mathbb{C}}(\mathbf{0}, \sigma_n^2 \mathbf{I}_{N_r})$, whereas $\mathbf{p}_k \in \mathbb{C}^{N_t \times 1}$ represents the precoder of the $k$-th user in the MIMO system.
The estimated user symbols $\hat{\mathbf{s}} = [\hat{s}_1, \ldots, \hat{s}_K]^T$ are obtained by linearly filtering the signal received at the BS, i.e., $\hat{\mathbf{s}} = \mathbf{W}^H \mathbf{y}$, where $\mathbf{W} \in \mathbb{C}^{N_r \times K}$ represents the BS filter. Note that this model simplifies when the direct paths between the users and the BS are unavailable, i.e., when $\mathbf{H}_{\mathrm{UB}k}$, $\forall k$, are blocked, in which case the LOS term is omitted in (1). Note also that the index $\ell$ is used to identify each IRS and can be dropped when considering a system aided by a single IRS, i.e., $L = 1$.
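The uplink model described above (direct plus reflected channel applied to each precoded symbol, followed by a linear filter at the BS) can be sketched numerically as follows. This is a minimal illustration, not a reference implementation: the dimensions, the i.i.d. Rayleigh channels, the random IRS phases, the QPSK symbols, and the zero-forcing choice for the BS filter are all assumptions made here for the example.

```python
import numpy as np

def crandn(rng, *shape):
    """Unit-variance circularly symmetric complex Gaussian samples."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def effective_channels(H_IB, Theta, H_UI, H_UB, P):
    """Stack g_k = (sum_l H_IB[l] @ Theta[l] @ H_UI[l][k] + H_UB[k]) @ p_k as columns."""
    columns = []
    for k in range(len(H_UB)):
        H_k = H_UB[k].copy()                      # direct user-BS path
        for l in range(len(H_IB)):                # add every reflected path
            H_k += H_IB[l] @ Theta[l] @ H_UI[l][k]
        columns.append(H_k @ P[k])
    return np.stack(columns, axis=1)              # shape (N_r, K)

rng = np.random.default_rng(0)
K, L, N, N_t, N_r = 3, 2, 16, 2, 4                # hypothetical sizes

H_UB = [crandn(rng, N_r, N_t) for _ in range(K)]
H_UI = [[crandn(rng, N, N_t) for _ in range(K)] for _ in range(L)]
H_IB = [crandn(rng, N_r, N) for _ in range(L)]
Theta = [np.diag(np.exp(1j * rng.uniform(0, 2 * np.pi, N))) for _ in range(L)]
P = [p / np.linalg.norm(p) for p in (crandn(rng, N_t) for _ in range(K))]

s = np.exp(1j * np.pi / 4 * (2 * rng.integers(0, 4, K) + 1))  # QPSK symbols
sigma_n = 0.05                                    # assumed noise standard deviation
G = effective_channels(H_IB, Theta, H_UI, H_UB, P)
y = G @ s + sigma_n * crandn(rng, N_r)            # received signal at the BS

W = np.linalg.pinv(G).conj().T                    # zero-forcing filter: W^H G = I
s_hat = W.conj().T @ y                            # estimated user symbols
print(np.round(np.abs(s_hat - s), 3))             # per-user estimation error
```

Setting every entry of `H_UB` to zero reproduces the case with blocked direct paths, and `L = 1` gives the single-IRS system.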

Signal Model of Downlink IRS-Assisted Systems
In the system model shown in Figure 4, we consider a typical downlink scenario with LOS propagation and multi-stream transmission such that the BS with $N_t$ antennas allocates $N_{s,k}$ streams to be transmitted to each user equipped with $N_r$ antennas. The vector of the $N_{s,k}$ symbols transmitted by the BS to the $k$-th user is $\mathbf{s}_k = [s_{k1}, s_{k2}, \ldots, s_{kN_{s,k}}]^T$, $\mathbf{H}_{\mathrm{BU}k} \in \mathbb{C}^{N_r \times N_t}$ represents the channel response from the BS to the $k$-th user, $\mathbf{H}_{\mathrm{IU}k}^{(\ell)} \in \mathbb{C}^{N_r \times N}$ stands for the channel response from the $\ell$-th IRS to the $k$-th user, and $\mathbf{H}_{\mathrm{BI}}^{(\ell)} \in \mathbb{C}^{N \times N_t}$ is the channel response from the BS to the $\ell$-th IRS. According to this system model, the received signal by the $k$-th user reads as

$$\mathbf{y}_k = \left( \sum_{\ell=1}^{L} \mathbf{H}_{\mathrm{IU}k}^{(\ell)} \boldsymbol{\Theta}^{(\ell)} \mathbf{H}_{\mathrm{BI}}^{(\ell)} + \mathbf{H}_{\mathrm{BU}k} \right) \sum_{j=1}^{K} \mathbf{P}_j \mathbf{s}_j + \mathbf{n}, \qquad (2)$$

where $\mathbf{n} = [n_1, n_2, \ldots, n_{N_r}]^T$ represents the complex-valued AWGN and $\mathbf{P}_k \in \mathbb{C}^{N_t \times N_{s,k}}$ represents the precoder employed by the BS to communicate with the $k$-th user. At reception, each user obtains its symbols by applying a linear filter $\mathbf{W}_k \in \mathbb{C}^{N_r \times N_{s,k}}$, that is, $\hat{\mathbf{s}}_k = \mathbf{W}_k^H \mathbf{y}_k$. Note that the mean square error (MSE)-duality to transform the receive filters in the downlink into the precoders in the uplink can be invoked to describe an uplink scenario through a downlink scenario and vice versa [33,34].

Channel Estimation
The channel estimation is a challenging problem since the selection of the phase-shift matrix at the IRS is based on the acquired channel state information (CSI). The direct transmitter-to-receiver link can be determined by means of traditional channel estimation strategies based on pilot transmission according to a training sequence. However, the estimations for the transmitter-to-IRS and the receiver-to-IRS links are more challenging since the IRS is usually assumed to be a totally passive element, i.e., it is not feasible to enable a pilot transmission from the IRS to perform channel estimation. Furthermore, when considering a large number of meta-atoms, a high-dimensional channel must be estimated, which demands high requirements in terms of computation, power consumption, and time. Some works have considered this problem as a key issue to enabling the IRS technology for wireless communication systems.

THz Systems
In [35], the channel estimation is realized by beam training in NLOS conditions, i.e., there is no direct path between the communication ends. A hierarchical codebook design is provided as the basis of beam training to reduce the estimation complexity in the THz MIMO systems. The authors leverage the narrow beams characteristic of THz systems and employ the hierarchical beam sweeping to find the strongest beam power. The overall channel estimation procedure is performed in a cooperative way between both communication ends. The results are presented in terms of the spectral efficiency (SE) and compare the system performance obtained with the proposed channel estimation algorithm versus that obtained with perfect CSI. The results exhibit a small gap between the imperfect CSI and the perfect CSI approaches.
The work in [36] combines the IRS and the MIMO technologies, and develops a cooperative beam training scheme to facilitate the channel estimation in a multiuser analog-digital hybrid IRS-assisted THz MIMO system. In particular, two different hierarchical codebooks for the proposed training procedure are designed. Based on the training results, the authors also propose two hybrid beamforming designs for both single-user and multiuser scenarios, respectively. In order to reduce the computational complexity of the real-time implementations, an efficient ternary-tree search is proposed for the BS and users, which is proved to be more efficient compared to the binary-tree search. The results are presented in terms of the sum-rate showing a small gap between the system performance with perfect and imperfect CSI when employing the proposed procedure for channel estimation.

OFDM Systems
In [37] a transmission protocol has been proposed to perform channel estimation for an IRS-enhanced single-user single-input single-output (SISO) orthogonal frequency division multiplexing (OFDM) system. A least squares (LS) derivation for the CSI estimation is performed by assuming that the IRS can be divided into multiple sub-surfaces of adjacent strongly correlated reflecting elements. The proposal is a low-complexity alternative based on the estimation of the strongest signal path. The well known Zadoff-Chu sequence [38] is employed as the pilot sequence during the first sub-frame. The method proposed in [37] provides MSE values lower than those obtained with the On/Off-based algorithm in [39]. In [39], two methods have been developed. One is a semidefinite relaxation (SDR) algorithm that achieves close-to-optimal performance but whose complexity is extremely high for practical scenarios. The other is a low complexity On/Off-based channel estimation method which is based on On/Off state control of the IRS reflecting elements by grouping adjacent elements that present high channel correlation. The gains offered by [37] over [39] come from the reflection power loss and noise enhancement present in the On/Off-based channel estimation method in [39]. The work in [40] considers a three-phase pilot-based channel estimation framework to acquire the overall uplink channels in an IRS-assisted single-input multiple-output (SIMO) system by considering the correlation among the user-IRS-BS reflected channels of the different users (cf. [40], Table 1). The results are presented in terms of the MSE and the proposed method provides a higher performance compared to that of the benchmark scheme that neglects the channel correlation.
A practical transmission protocol for channel estimation is proposed in [39] for an OFDM SISO system under frequency selective channels with a single cell-edge user and LOS. The authors also employ the technique of flexibly grouping the IRS elements with On/Off states and estimating the combined channel per group which also reduces training overhead and estimation complexity. The results are presented in terms of the achievable rate and compare the system performance under the utilization of the proposed channel estimation technique versus that of the scheme with perfect CSI. The performance loss obtained in the numerical experiments is small.
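The noise enhancement attributed above to On/Off sounding can be illustrated with a small least-squares experiment. The sketch below is not the protocol of [37] or [39]: it assumes a narrowband single-antenna link, a unit pilot symbol, no element grouping, and it swaps in a DFT-based reflection pattern (all elements reflecting in every slot) as the full-reflection alternative. The function names and parameter values are hypothetical.

```python
import numpy as np

def on_off_pattern(N):
    """Training matrix for On/Off sounding: in slot 0 all elements are off,
    in slot n only element n is on. Column 0 is the direct path."""
    A = np.zeros((N + 1, N + 1), dtype=complex)
    A[:, 0] = 1.0
    A[np.arange(1, N + 1), np.arange(1, N + 1)] = 1.0
    return A

def dft_pattern(N):
    """Every element reflects with unit modulus in every slot (DFT phases)."""
    idx = np.arange(N + 1)
    return np.exp(-2j * np.pi * np.outer(idx, idx) / (N + 1))

def ls_estimate(A, y):
    """Least-squares estimate of h = [h_d, c_1, ..., c_N] from y = A h + n."""
    return np.linalg.lstsq(A, y, rcond=None)[0]

def empirical_mse(A, h, sigma, rng, trials=200):
    """Average squared estimation error over independent noise realizations."""
    err = 0.0
    for _ in range(trials):
        noise = sigma * (rng.standard_normal(len(h))
                         + 1j * rng.standard_normal(len(h))) / np.sqrt(2)
        err += np.mean(np.abs(ls_estimate(A, A @ h + noise) - h) ** 2)
    return err / trials

rng = np.random.default_rng(1)
N, sigma = 16, 0.1                                  # assumed IRS size and noise level
# h stacks the direct channel and the N cascaded (user-IRS-BS) coefficients
h = (rng.standard_normal(N + 1) + 1j * rng.standard_normal(N + 1)) / np.sqrt(2)

mse_on_off = empirical_mse(on_off_pattern(N), h, sigma, rng)
mse_dft = empirical_mse(dft_pattern(N), h, sigma, rng)
print(f"On/Off MSE: {mse_on_off:.5f}, DFT-pattern MSE: {mse_dft:.5f}")
```

In this toy setting each cascaded coefficient under On/Off sounding is obtained as the difference of two noisy observations, while the DFT pattern averages the noise over all training slots, which is one way to see where the gap comes from.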
In [41], the authors consider a single-user IRS-aided OFDM multiple-input single-output (MISO) system under LOS conditions and propose a deep learning (DL) based channel estimation method. A convolutional neural network (CNN) is designed to estimate both the direct and cascaded channels of the system. The authors propose a single neural network (NN) for first estimating the combined channel and then recovering individual channel estimates for complexity reductions in the training process. Simulation results show that the proposed DL approach achieves a performance better than the On/Off patterns in [39] because when estimating the direct link with all the IRS elements turned off, the corresponding effective channel gain for pilot transmission is smaller and this leads to higher estimation errors. In [42] the authors formulate two low-complexity LS-based estimation methods for frequency-selective fading channels in the uplink of multiuser IRS-aided OFDM SIMO systems. The first strategy is applicable for arbitrary frequency-selective fading channels. The second channel estimation scheme is especially suited for LOS scenarios. The authors also optimize the training designs (pilot tone allocations and IRS time-varying reflection patterns) for each method to minimize the channel estimation error. Comparisons are performed w.r.t. benchmark scenarios regarding pilot tone allocation, which offer a lower performance in the channel estimation process.

MmWave Systems
In [43] a compressed-sensing approach is considered to estimate the cascaded BS-IRS-user channel in a MISO mmWave downlink system with NLOS conditions by addressing the channel estimation as a sparse signal recovery problem. The results are presented in terms of the normalized mean-square error (NMSE), and the proposed method shows higher performance than that obtained in [44]. In particular, the LS estimator in [44] requires many more measurements to achieve a performance similar to that of the method proposed in [43], which therefore results in a more computationally efficient estimator.
In [45] a compressive sensing (CS)-based channel estimation solution for the downlink of a multiuser OFDM IRS-aided hybrid mmWave MISO cellular system is proposed. The angular channel sparsity caused by large-scale arrays present in mmWave is exploited to perform the CSI acquisition with reduced pilot overhead. The authors assume that the BS-IRS channel is known and then employ a distributed orthogonal matching pursuit (OMP) algorithm for the IRS-users channel estimation. The results are presented in terms of the NMSE and the proposed algorithm is compared with that developed in [44]. The results show that the proposed algorithm leads to an NMSE lower than that obtained in [44] with the LS algorithm, since the performance of the latter method is limited by the interference corresponding to the NLOS paths.
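To make the sparse-recovery formulation more concrete, the following is a minimal sketch of plain orthogonal matching pursuit, not the distributed variant of [45] nor the estimator of [43]. The random sensing matrix stands in for the product of pilots, reflection patterns, and an angular dictionary, which in the cited works is built from array steering vectors; the sizes, the sparsity level, and the noise level are assumptions chosen for illustration.

```python
import numpy as np

def omp(A, y, sparsity):
    """Greedy recovery of a `sparsity`-sparse x from y = A x (+ noise)."""
    residual = y.copy()
    support = []
    coef = np.zeros(0, dtype=complex)
    for _ in range(sparsity):
        corr = np.abs(A.conj().T @ residual)      # match atoms to the residual
        if support:
            corr[support] = 0.0                   # never pick an atom twice
        support.append(int(np.argmax(corr)))
        # Re-fit all selected atoms jointly (the "orthogonal" step)
        coef = np.linalg.lstsq(A[:, support], y, rcond=None)[0]
        residual = y - A[:, support] @ coef
    x = np.zeros(A.shape[1], dtype=complex)
    x[support] = coef
    return x

rng = np.random.default_rng(2)
n_meas, n_atoms, n_paths = 48, 128, 3             # hypothetical dimensions
A = (rng.standard_normal((n_meas, n_atoms))
     + 1j * rng.standard_normal((n_meas, n_atoms))) / np.sqrt(2 * n_meas)

x_true = np.zeros(n_atoms, dtype=complex)         # sparse angular-domain channel
paths = rng.choice(n_atoms, n_paths, replace=False)
x_true[paths] = np.exp(1j * rng.uniform(0, 2 * np.pi, n_paths))

noise = 0.01 * (rng.standard_normal(n_meas) + 1j * rng.standard_normal(n_meas))
x_hat = omp(A, A @ x_true + noise, n_paths)
nmse = np.linalg.norm(x_hat - x_true) ** 2 / np.linalg.norm(x_true) ** 2
print(f"NMSE with {n_meas} measurements for {n_atoms} unknowns: {nmse:.2e}")
```

An unstructured LS estimator would need at least as many measurements as unknowns (here 128), whereas the greedy search operates with far fewer by exploiting the small number of dominant paths.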
In [46], the high-resolution channel estimation problem for IRS-assisted mmWave MIMO systems was studied. Again, the cascaded BS-IRS-user channel estimation is formulated as a sparse recovery problem. A matching pursuit (MP)-based high-resolution estimation is performed, which outperforms the scheme in [35] in terms of the NMSE. The computational complexity of this algorithm depends on the dictionary matrix which leads to a prohibitive complexity in practical implementations. Nevertheless, the authors propose to exploit the coarse angular domain information obtained by beam training in order to reduce the size of the dictionary matrix and thus the complexity.
The work in [47] also considers a DL framework for channel estimation in a multiuser IRS-assisted mmWave MISO system with non-ideal switching. A twin CNN architecture is designed and it is fed with the received pilot signals to estimate both the direct and the BS-IRS-users cascaded channels by considering LOS conditions between the users and the BS. Each user has a deep network that is fed by the pilot signals received from the BS in the downlink to estimate the direct and the cascaded channels. The results show lower minimum mean square error (MMSE) with respect to the approach in [37].

Wideband Systems under Beam Squint
The authors in [48] consider both the channel's frequency selectivity and the effect of beam squint to develop a twin-stage orthogonal matching pursuit (TS-OMP) CSI estimation algorithm in a single-user SISO wideband OFDM system by considering NLOS between both ends, i.e., only the cascaded BS-IRS-user channel is estimated in the downlink. The authors demonstrate that the mutual correlation function between the spatial steering vectors and the cascaded channel presents two peaks, which leads to a pair of estimated angles for a single propagation path due to the beam squint effect. One of those angles constitutes the frequency-independent angle while the other one is frequency-dependent and introduces false angle estimations. In order to reduce the influence of false angles on the CSI acquisition, the TS-OMP algorithm obtains the path angles of the cascaded channel in the first stage, while the propagation gains and delays are obtained in a second stage. A bespoke pilot design that exploits the characteristics of the mutual correlation function and the cross-entropy theory is also proposed to obtain a CSI estimation with improved performance. The NMSE is the metric employed to evaluate the system performance and comparisons are made with a benchmark approach that uses a high number of pilots. This benchmark approach only provides a small performance gain over the low-complexity TS-OMP algorithm.
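The beam squint effect itself can be visualized with a simple array model. The sketch below is not the cascaded-channel model of [48]: it assumes a uniform linear array with spacing fixed at half the carrier wavelength and a frequency-flat beam designed at the carrier frequency, and it only shows how the gain towards a fixed direction drops at other subcarrier frequencies. The function names and the chosen numbers are hypothetical.

```python
import numpy as np

def steering_vector(n_elements, sin_theta, freq_ratio):
    """Response of a ULA at frequency f = freq_ratio * f_c. The spacing is
    fixed at lambda_c / 2, so the phase slope scales with f / f_c."""
    n = np.arange(n_elements)
    return np.exp(1j * np.pi * freq_ratio * n * sin_theta) / np.sqrt(n_elements)

def beam_gain(n_elements, sin_theta, freq_ratio):
    """Gain towards sin_theta of a beam designed at the carrier frequency."""
    w = steering_vector(n_elements, sin_theta, 1.0)       # designed at f_c
    a = steering_vector(n_elements, sin_theta, freq_ratio)
    return np.abs(np.vdot(w, a)) ** 2                     # vdot conjugates w

n_elements = 64
sin_theta = np.sin(np.deg2rad(60.0))
for ratio in np.linspace(0.95, 1.05, 5):                  # +/- 5% bandwidth
    print(f"f/f_c = {ratio:.3f}: gain = {beam_gain(n_elements, sin_theta, ratio):.3f}")
```

The loss grows with the array size, the bandwidth, and the off-broadside angle, which is why frequency-dependent angles appear when wideband signals meet large surfaces.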

Sub-6 GHz Narrow Band Systems
In [49] the authors consider the downlink of a multiuser IRS-assisted MISO system with NLOS conditions and develop two methods based on the PARAllel FACtor (PARAFAC) modelling to unfold the BS-IRS-user cascaded channel model. The proposed method ([49], Algorithm 1) includes an alternating least squares procedure to iteratively estimate both the channel between the BS and the IRS, as well as the channels between the IRS and the users. The results are presented in terms of the NMSE and the two algorithms are compared with each other. The main limitation of these algorithms is their high computational complexity, which makes them unsuitable for practical implementations.
In [44], another LS approach is considered to estimate the uplink cascaded channels in a point-to-point MISO system with a multi-antenna BS and an energy harvesting (EH) user. The results are presented in terms of the energy harvested by the user and compare the system performance with the proposed method versus that obtained with perfect CSI, showing a small gap between them. The cascaded channel usually has a large size and this method may incur a considerable amount of training overhead by neglecting the sparse structure of the mmWave wireless channels.

Optimization and Design of IRSs
The design of IRSs is a novel and challenging issue from the signal processing point of view. However, some recent works can be found in the literature that consider different setups and different metrics to design the IRS phase-shift matrix. We group the analyzed works according to the configuration of the communication setups, i.e., the number of transmit/receive antennas, and we also distinguish between single and multiuser scenarios. Within each group, the works address different communication scenarios with different characteristics, such as narrowband or wideband transmissions, LOS or NLOS links, perfect or imperfect CSI, and they also consider different metrics to optimize the overall system performance.

Single-User SISO Scenario
In [6] a single-user SISO scheme with LOS conditions has been considered. The authors proved that the optimal IRS phase-shift setting is the setup that aligns the reflected rays to the direct path between the BS and the users. The paper also compares the IRS technology with conventional decode-and-forward (DF) relaying showing that the IRS needs hundreds of reconfigurable elements to reach a behaviour similar to that of the DF relaying. Additionally, the IRS achieves higher energy efficiency (EE) than DF relaying if very high rates are needed. By considering a simple LOS scenario where the IRS is deployed between a BS and the user to assist the communication, the optimized selection of the phase of each discrete element in the IRS leads to phase alignment of the direct path (BS to user) and scattered paths (BS to IRS to users).
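The phase-alignment rule described above can be checked numerically. In the sketch below, `h_d` is the direct BS-user coefficient, `h` collects the BS-IRS coefficients and `g` the IRS-user coefficients; these names, the Rayleigh-distributed channels, and the IRS size are assumptions introduced for the example and are not taken from [6].

```python
import numpy as np

def aligned_phases(h_d, h, g):
    """Phase shifts that rotate every reflected term onto the direct path:
    theta_n = arg(h_d) - arg(g_n * h_n)."""
    return np.angle(h_d) - np.angle(g * h)

def channel_gain(h_d, h, g, theta):
    """Magnitude of the end-to-end channel h_d + sum_n g_n e^{j theta_n} h_n."""
    return np.abs(h_d + np.sum(g * np.exp(1j * theta) * h))

rng = np.random.default_rng(3)
N = 64                                              # hypothetical number of elements
h_d = (rng.standard_normal() + 1j * rng.standard_normal()) / np.sqrt(2)
h = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)
g = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)

gain_opt = channel_gain(h_d, h, g, aligned_phases(h_d, h, g))
gain_rand = channel_gain(h_d, h, g, rng.uniform(0, 2 * np.pi, N))
print(f"aligned: {gain_opt:.2f}, random phases: {gain_rand:.2f}")
```

With aligned phases all terms add coherently, so the gain equals the sum of the individual magnitudes, which is the upper bound given by the triangle inequality.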
In [50], a point-to-point SISO system assisted with multiple IRSs is considered. The authors present multi-layer perceptron NN architectures that can be trained either with positioning values or the channel coefficients. Both centralized and individual training of the IRSs are proposed. The simulation results show that achievable rates close to the optimum scheme can be achieved.
Some works consider single-user point-to-point OFDM systems. The authors in [51] approach the optimization of the IRS phase-shift matrix in a single-user SISO OFDM system with blockage of the direct paths between both ends, i.e., NLOS. Specifically, the authors develop a deep reinforcement learning (DRL)-based framework, which requires minimal training overhead, to solve a non-convex optimization problem by considering few active elements at the IRS and imperfect CSI. The results show that the proposed DRL-based framework is near optimum in terms of rate and converges to the solution with perfect CSI. A single-user SISO OFDM IRS-assisted system is considered in [52], where a machine learning (ML) solution to design the IRS phase-shift matrix is employed. The solution exploits DL tools to learn how to predict the proper IRS matrix configuration directly from the sampled channel knowledge. The simulation results show that the proposed solution can achieve near-optimal data rates with negligible training overhead, without any knowledge of the IRS geometry.
The authors in [53] also consider a system with imperfect CSI. Specifically, a point-to-point SISO system with LOS conditions between both ends is assumed. Two solutions for the design of the IRS phase-shift matrix are proposed. In the first approach, the authors employ a compressive sensing strategy to construct the channels by considering a few active elements at the IRS. In the second approach, a DL-based solution is proposed where the IRS learns how to interact with the incoming signals given the channel coefficients at the active elements. The achievable rates obtained with the proposed solutions approach the upper bound with perfect CSI, with minor training overhead.
The authors in [54] consider an uplink single-user cellular network and derive an approximation of the achievable data rate by considering a practical IRS implemented via phase shifters with a limited resolution, i.e., discrete phase shifts. A derivation of the required number of phase shifts under a data rate degradation constraint is performed. Table 1 summarizes the references related to single-user SISO systems, their system model assumptions, the performance metrics considered for the optimization, and the main results obtained.
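The impact of limited-resolution phase shifters can be explored with a short experiment. This is a sketch under stated assumptions, not the analysis of [54]: only the reflected path is modelled, the cascaded coefficients `c` are i.i.d. complex Gaussian, and each ideal phase is rounded to the nearest of $2^b$ uniformly spaced levels.

```python
import numpy as np

def quantize_phases(theta, bits):
    """Map each phase to the nearest of 2**bits uniform levels in [0, 2*pi)."""
    step = 2 * np.pi / 2 ** bits
    return (np.round(theta / step) * step) % (2 * np.pi)

rng = np.random.default_rng(4)
N = 128                                             # hypothetical number of elements
c = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)

theta_opt = -np.angle(c)                            # continuous phases: coherent sum
gain_opt = np.sum(np.abs(c))

gains = {}
for bits in (1, 2, 3, 4):
    theta_q = quantize_phases(theta_opt, bits)
    gains[bits] = np.abs(np.sum(c * np.exp(1j * theta_q)))
    print(f"{bits}-bit phase shifters: {gains[bits] / gain_opt:.3f} of the ideal amplitude")
```

Since the per-element phase error is at most $\pi/2^b$, the amplitude with $b$-bit shifters is never below $\cos(\pi/2^b)$ times the ideal one in this model, which gives a quick feel for how many bits a given degradation budget requires.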

Multiuser SISO Scenarios
In [55] the authors consider a multiuser SISO IRS-assisted system in a LOS scenario and define the received signal-to-interference-plus-noise ratio (SINR) as the metric to be optimized at the receiver by considering that the impinging signals upon an IRS are often mixed with interfering signals. To tackle this issue, the concept of intelligent spectrum learning (ISL) is introduced, which uses an offline trained CNN at the IRS controller to infer the interfering signals directly from the user signals. A distributed control algorithm is proposed to maximize the received SINR by configuring an active/inactive binary status of the IRS elements. Simulation results exhibit a performance improvement by employing this deep learning approach.
In [56], the authors aim at minimizing the MSE in a multiuser SISO multi-IRS assisted federated learning system. LOS conditions are assumed and a joint optimization of the user power allocation and the IRS phase-shift matrices via an alternating optimization (AO) algorithm is performed. The IRSs are considered to have a binary status, i.e., On/Off state. Simulation results show that the proposed algorithms improve the convergence and accuracy of the federated learning approach and offer gains over baseline strategies without IRSs. Table 2 summarizes both multiuser SISO approaches [55,56], and their assumptions in the system model, as well as the performance metrics and the main results.

Single-User MISO Scenarios
The authors in [57] propose a suboptimal semidefinite relaxation algorithm to maximize the total signal power received by a single user communicating with a multi-antenna access point (AP) in LOS conditions. The simulation results exhibit performance gains over baselines (e.g., systems without IRS) in terms of the received signal-to-noise ratio (SNR).
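The joint design of the AP beamformer and the IRS phases for a single-user MISO link can be illustrated with a simple alternating scheme. The sketch below is swapped in for illustration and is not the semidefinite relaxation of [57]: it alternates between maximum ratio transmission for fixed phases and phase alignment for a fixed beamformer. The channel names (`h_d`, `h_r`, `G`), their conventions (no conjugation), and the dimensions are assumptions.

```python
import numpy as np

def alternating_optimization(h_d, h_r, G, iters=8):
    """h_d: (M,) AP-user channel, h_r: (N,) IRS-user channel, G: (N, M) AP-IRS
    channel. Returns the beamformer, the IRS coefficients and the received
    power recorded after every iteration."""
    v = np.ones(len(h_r), dtype=complex)            # start from zero phase shifts
    history = []
    for _ in range(iters):
        h_eff = h_d + (v * h_r) @ G                 # effective AP-user channel
        w = h_eff.conj() / np.linalg.norm(h_eff)    # MRT for the current phases
        a = h_d @ w                                 # direct-path contribution
        b = h_r * (G @ w)                           # per-element reflected terms
        v = np.exp(1j * (np.angle(a) - np.angle(b)))  # align reflections with a
        history.append(np.abs(a + v @ b) ** 2)      # received signal power
    return w, v, np.array(history)

rng = np.random.default_rng(5)
M, N = 4, 32                                        # hypothetical AP antennas / IRS elements
cn = lambda *shape: (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)
h_d, h_r, G = cn(M), cn(N), cn(N, M)

w, v, history = alternating_optimization(h_d, h_r, G)
print(np.round(history, 2))
```

Each half-step solves its own subproblem exactly, so the recorded power cannot decrease from one iteration to the next, although the scheme is only guaranteed to reach a stationary point rather than the global optimum.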
In [58], a DL-based approach for the design of the IRS phase-shift matrix in a single-user MISO system is proposed. Specifically, a customized deep NN is trained offline by using the unsupervised learning mechanism, which is able to make real-time predictions when deployed online. The simulation results show that the proposed approach offers slightly lower system performance than the semidefinite relaxation based approaches in [57] in terms of rate while it significantly reduces the computational complexity.
In [59], a DL approach has been developed to tune the IRS phase-shift matrix in real-time by considering a MISO system with LOS, i.e., a direct link is available between both ends. Simulation results show that the DL approach leads to performance comparable to that of conventional approaches while significantly reducing the computational complexity.
In [60], a DL approach which learns and makes use of the local propagation environment is deployed for the configuration of the IRS phase-shift matrix in a downlink single-user MISO system with LOS conditions and imperfect CSI. The proposed method uses the received pilot signals reflected through the IRS to train the network. The performance of the proposed approach is evaluated in terms of the NMSE between the phases obtained by the proposed algorithm with imperfect CSI and the optimal IRS phases based on perfect CSI. Table 3 summarizes the main results, system model assumptions, and evaluation performance metrics of the papers analysed in this subsection.

Multiuser MISO/SIMO Scenarios
In [24] an IRS-assisted multiuser MISO system with LOS between a common AP and the users is considered. In particular, a practical phase shift model that captures the phase-dependent amplitude variation in the element-wise reflection design at the IRS is proposed. An optimization problem to minimize the total transmit power at the AP by jointly designing the AP transmit precoder and the IRS phase-shift matrix subject to the users' individual SINR constraints is stated. Then, an AO algorithm is proposed to find suboptimal solutions. The simulation results show the asymptotic performance loss of the optimized IRS-assisted system when it is implemented with practical phase shifters and the obtained gains by optimizing with the practical model under consideration. The work in [61] also considers the practical limitations of the IRSs. In particular, the authors have studied the asymptotic achievable rate in an IRS-assisted downlink MISO system. The system model considers a common BS transmitting to K users by employing time-division multiple access (TDMA) to serve one user at each time slot. Multiple IRSs are deployed in a scheme where some users are IRS-aided and the others are directly served by the BS. The optimality of the asymptotic performance has been analyzed by considering the practical limitations of the IRSs, e.g., the practical reflection coefficients. In this context, an IRS phase-shift matrix design is proposed, which is able to asymptotically achieve the optimal performance. The results are presented in terms of the symbol-error-rate (SER) and the ergodic rate and compared with a baseline strategy without IRS and the AO algorithm proposed in [24].
The authors in [62] consider a multiuser IRS-assisted SIMO system, where the uplink transmissions are reflected by the IRS. An approximated zero-forcing (ZF) equalizer is implemented to cancel the inter-user interference in a decentralized way by optimizing the IRS phase shifters (see [62], Algorithms 1 and 2). The results are presented in terms of the signal-to-interference ratio (SIR) and show better performance than a single daisy chain realization of the algorithm in [63] to compute the equalization vector.
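As a point of reference for the interference-cancelling behaviour described above, the sketch below applies a plain centralized ZF equalizer to a noiseless multiuser SIMO uplink. It is not the approximated, decentralized scheme of [62]; the effective channel `H` (direct plus IRS-reflected paths) is drawn at random, and all dimensions are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
num_rx_antennas, num_users = 8, 3

# Hypothetical effective uplink channel, one column per single-antenna user.
H = (rng.standard_normal((num_rx_antennas, num_users))
     + 1j * rng.standard_normal((num_rx_antennas, num_users))) / np.sqrt(2)

# ZF equalizer: the left pseudo-inverse of H, so that W_zf @ H is the identity
# and every user's symbol is recovered without inter-user interference.
W_zf = np.linalg.pinv(H)

symbols = np.array([1 + 1j, -1 + 1j, 1 - 1j]) / np.sqrt(2)  # one QPSK symbol per user
received = H @ symbols                                      # noiseless received vector
estimates = W_zf @ received
```

With noise present, the same equalizer amplifies the noise when `H` is poorly conditioned, which is one reason the IRS phases are worth optimizing in such a setup.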
Several works address IRS-aided non-orthogonal multiple access (NOMA) systems. In [64], an IRS-aided multiuser MISO NOMA downlink transmission framework is proposed. The authors assume NLOS between the users and the BS. An optimization problem to maximize the sum rate is formulated. To adjust the IRS phase-shift matrix, a deep deterministic policy gradient (DDPG) algorithm, which dynamically learns the resource allocation policy, is defined. The proposed framework achieves a larger sum rate than traditional orthogonal multiple access (OMA) networks.
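The sum-rate comparison between NOMA and OMA can be illustrated with a two-user, single-antenna toy example. The sketch below uses the textbook power-domain NOMA rates with successive interference cancellation (SIC) at the strong user and an equal time split for OMA; the channel gains, the power split, and the SNR are hypothetical values, and the ordering of the two sum rates is only checked for this particular choice.

```python
import numpy as np

def noma_rates(gain_weak, gain_strong, power_share_weak, snr):
    """Two-user downlink power-domain NOMA rates in bit/s/Hz."""
    a_weak = power_share_weak
    a_strong = 1.0 - power_share_weak
    # The weak user decodes its own signal and treats the strong user's signal as noise.
    rate_weak = np.log2(1 + a_weak * snr * gain_weak / (a_strong * snr * gain_weak + 1))
    # The strong user first removes the weak user's signal via SIC.
    rate_strong = np.log2(1 + a_strong * snr * gain_strong)
    return rate_weak, rate_strong

def oma_rates(gain_weak, gain_strong, snr):
    """Orthogonal access with an equal time split and full power in each slot."""
    return 0.5 * np.log2(1 + snr * gain_weak), 0.5 * np.log2(1 + snr * gain_strong)

# Hypothetical effective channel gains (e.g., after IRS phase optimization).
sum_rate_noma = sum(noma_rates(gain_weak=1.0, gain_strong=10.0, power_share_weak=0.8, snr=10.0))
sum_rate_oma = sum(oma_rates(gain_weak=1.0, gain_strong=10.0, snr=10.0))
```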
In [65], an IRS-assisted NOMA MISO downlink system is considered. The authors consider the joint optimization of the beamforming vectors at the BS and the phase-shift matrix at the IRS to minimize the total transmission power consumption at the BS. The problem is solved via an AO minimization approach, and the results show that this solution reduces the transmission power compared with a baseline strategy that imposes random phase shifts at the IRS.
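The alternating structure of such AO schemes can be sketched in a much simpler setting than the one in [65]: a single user, no NOMA, and ideal continuous phase shifts. In this simplification, minimizing the transmit power under an SNR target is equivalent to maximizing the effective channel gain, so the sketch alternates between a maximum-ratio transmit beamformer and a closed-form phase alignment. All channels are random and all names (`hd_row`, `hr_row`, `G`) are hypothetical.

```python
import numpy as np

def effective_channel(hd_row, hr_row, G, v):
    """Row channel seen by the BS: direct path plus IRS-reflected path."""
    return hd_row + (hr_row * v) @ G

def alternating_optimization(hd_row, hr_row, G, v_init, num_iters=20):
    """Alternate between the BS beamformer w and the IRS coefficients v."""
    v = v_init.copy()
    gains = []
    for _ in range(num_iters):
        # Step 1: for fixed v, maximum-ratio transmission is optimal.
        h_eff = effective_channel(hd_row, hr_row, G, v)
        w = h_eff.conj() / np.linalg.norm(h_eff)
        # Step 2: for fixed w, align every reflected term with the direct term.
        direct = hd_row @ w
        cascaded = hr_row * (G @ w)
        v = np.exp(1j * (np.angle(direct) - np.angle(cascaded)))
        gains.append(np.linalg.norm(effective_channel(hd_row, hr_row, G, v)) ** 2)
    return v, np.array(gains)

def complex_gaussian(rng, *shape):
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

rng = np.random.default_rng(1)
num_bs_antennas, num_irs_elements = 4, 32
hd_row = complex_gaussian(rng, num_bs_antennas)
hr_row = complex_gaussian(rng, num_irs_elements)
G = complex_gaussian(rng, num_irs_elements, num_bs_antennas)

# Baseline: random phase shifts at the IRS, with the matching MRT beamformer.
v_random = np.exp(1j * rng.uniform(0, 2 * np.pi, num_irs_elements))
gain_random = np.linalg.norm(effective_channel(hd_row, hr_row, G, v_random)) ** 2
v_ao, gains = alternating_optimization(hd_row, hr_row, G, v_random)

# Minimum transmit power that meets a hypothetical SNR target.
snr_target, noise_power = 10.0, 1.0
power_random = snr_target * noise_power / gain_random
power_ao = snr_target * noise_power / gains[-1]
```

Each step can only increase the effective channel gain, so the sequence `gains` is non-decreasing and the required power never exceeds that of the random-phase starting point. AO of this kind converges to a stationary point rather than a guaranteed global optimum.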
In [66], the authors consider the uplink of a cluster-based multiuser code-domain NOMA IRS-assisted MISO system. A setup is considered where each IRS covers a cluster of users, which have NLOS conditions with the BS. The IRS phase-shift matrices are optimized such that a large number of users are correctly detected. To overcome the coupling between the IRS phase shifts and other variables, such as the detection order and the filter, a sum-rate optimization is used to obtain a decoupled estimate of those variables. Then, the final IRS adjustment is performed via a semidefinite programming (SDP) relaxation of the optimization problem. Simulation results show the performance gains obtained (measured in terms of the number of correctly detected users) with respect to a random phase shift implementation when the IRS phases are properly optimized.
In [67], the authors consider a multiuser MISO IRS-assisted NOMA downlink system with NLOS conditions. The IRS-assisted NOMA system is designed to ensure that additional cell-edge users can be served by the BS. Both analytical and simulation results are provided to show the performance of the proposed scheme in terms of the outage probability. Hardware impairments are also analyzed.
In [68], the authors address a multiuser MISO IRS-assisted mmWave communication system. The motivation is to enhance the network reliability and connectivity in the presence of random blockages in mmWave, which usually implies NLOS. A stochastic optimization problem based on the minimization of the sum outage probability is formulated, and a stochastic-learning-based robust beamforming design is proposed via a gradient descent algorithm. The numerical results validate the performance benefits of the proposed algorithm in terms of outage probability and effective data rate over baseline strategies (non-IRS, random IRS).
In [69], a DRL-based algorithm for the configuration of the IRS and the BS beamformer is developed in a downlink IRS-assisted multiuser MISO system. The proposed DRL-based algorithm obtains the design of the BS precoder and the IRS phase-shift matrix as the output of the DRL NN from the instantaneous channel state information. Simulation results show that the algorithm is able to learn from the environment and improve its behaviour, and that it obtains comparable, although lower, performance than [70]. The work in [70] proposes two EE maximization algorithms for the BS transmit power allocation and the IRS phase-shift matrix in a downlink multiuser MISO system with NLOS. The results are presented in terms of EE and show that the proposed solution offers higher EE than the Amplify-and-Forward (AF) relay-assisted setup.
The authors in [71] present an infrastructure to perform ML tasks at a mobile edge computing (MEC) server with the assistance of an IRS in a multiuser MISO system. Therein, they aim at maximizing the learning performance. Specifically, a minimization of the maximum learning error (MLE) of all participating users is performed by jointly optimizing the transmit power of mobile users, the beamforming vectors of the BS, and the phase-shift matrix of the IRS. An AO-based framework is proposed to optimize the three terms iteratively, where a successive convex approximation (SCA)-based algorithm is proposed to solve the power allocation problem, closed-form expressions are derived to solve the beamforming design problem, and an alternating direction method of multipliers (ADMM)-based algorithm is designed to efficiently solve the phase-shift matrix design problem. Simulation results show significant gains when deploying the IRS over baseline strategies.
Several works related to the IRS phase-shift matrix design assume imperfect CSI. In [72,73] the authors propose an ML approach to optimize both the beamformers at the BS and the IRS phase-shift matrix in a multiuser MISO system. Such an approach employs a deep NN to parametrize the mapping from the received pilots to an optimized system configuration; a graph neural network (GNN) architecture is then considered to capture the interactions among the different users in the cellular network. Furthermore, the authors propose an implicit channel estimation, which is generalizable and leads to efficient learning to maximize the sum rate with a small number of pilots.
In [74] a downlink multiuser IRS-assisted MISO system is considered. Both perfect and imperfect CSI scenarios are addressed under the assumption of LOS between the common AP and the users. For the perfect CSI setup, an algorithm is proposed to maximize the weighted sum rate of the system by utilizing the fractional programming technique. This algorithm is then extended for the imperfect CSI setup. The simulation results are presented in terms of the weighted sum rate and show the performance gain offered by the proposed algorithms over benchmark strategies without IRS. However, these algorithms present high computational complexity.
In [75], a multiuser MISO multi-IRS-assisted system with no direct link between the users and the BS (NLOS conditions) is considered. The IRS phase-shift matrices are configured by exploiting the statistical CSI in an imperfect CSI scenario. In particular, two ML algorithms are proposed, which learn the statistical CSI from the historical channel observations. Numerical results show the largest gains in terms of rate when the channel randomness is low.
In [76] the authors investigate the IRS design under imperfect cascaded BS-IRS-user CSI in a multiuser MISO system with LOS between both ends. They aim to minimize the transmit power subject to the rate outage probability constraints by considering a statistical CSI error model. The minimization problem is reformulated by following a Bernstein-type inequality and has been solved via an AO framework.
In [77], a cognitive radio (CR) MISO downlink IRS-system is considered. Specifically, a single secondary user is coexisting with multiple primary users and multiple IRSs are deployed to enhance the EE and the SE of the system. The authors aim to maximize the achievable rate of the secondary user subject to a total transmit power constraint by considering imperfect CSI. Simulation results show the system performance improvement by including the IRS technology in the CR network.
The authors in [78] propose a DL method for online reconfiguration of the IRS in a SIMO complex indoor environment. The IRS phase-shift matrix configuration is set according to the receiver position to maximize the rate. The simulation results are presented in terms of the rate and the MSE between the output of the NN (the IRS phase-shift matrix) and the optimal phase-shift matrix for a given user position.
The authors in [79] consider an IRS-aided multiuser MISO downlink system, where the transmit beamforming and the IRS phase-shift matrix are jointly designed to maximize the system sum rate. The solution is a DL-based approach to perform the joint design. Specifically, a two-stage NN is implemented and trained offline in an unsupervised manner, and it is then deployed online for real-time predictions. Simulation results exhibit substantial reductions in the computational complexity with satisfactory performance compared to conventional iterative optimization algorithms as in [74] (see [79], Table 1).
The work in [80] considers the use of an IRS to enhance anti-jamming communication performance and mitigate jamming interference by properly adjusting the IRS elements in a multiuser MISO system under NLOS conditions between the users and the BS. The problem formulation is performed as a joint optimization of the power allocation at the BS and the IRS phase-shift matrix while considering quality of service (QoS) requirements for legitimate users. A fuzzy win or learn fast-policy hill-climbing (FWoL-FPHC) approach is proposed to jointly optimize the anti-jamming power allocation and the IRS phase-shift matrix. Simulation results show that the proposed anti-jamming learning-based approach can efficiently improve both the IRS-assisted system performance (measured in terms of rate) and the transmission protection level compared with the mmWave massive MIMO approach presented in [81].
In [82] a downlink multiuser IRS-aided MISO system is considered, where each IRS element offers only a finite number of phase shifts to assist the communication. A transmit power minimization approach is developed by jointly optimizing the transmit precoding at the BS and the discrete phase shifts at the IRS, subject to a given set of minimum SINR constraints at the user receivers. To solve the problem, the authors first study the case where only one user is assisted by the IRS and propose both optimal and suboptimal algorithms for solving it. It is also shown that the IRS with discrete phase shifts achieves the same squared power gain as the number of reflecting elements grows asymptotically large, while a constant proportional power loss is incurred depending on the number of phase-shift quantization levels. The proposed designs for the single-user case are also extended to the multiuser scenario where only some of the users are aided by the IRS. In the simulation results, comparisons with several benchmark schemes regarding the quantization in the IRS elements are performed.
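The constant proportional power loss caused by phase quantization can be checked numerically in a toy setting. The sketch below considers a single-antenna link through the IRS only (no direct path), rounds the ideal phases to a uniform grid of 2^b levels, and compares the resulting received power with the closed-form ratio (sin(π/2^b)/(π/2^b))², which follows from assuming a quantization error that is uniformly distributed over one quantization step. The channel statistics and the number of elements are hypothetical, and the setup is far simpler than the multiuser MISO problem of [82].

```python
import numpy as np

def quantize_phase(theta, bits):
    """Round each phase to the nearest of 2**bits uniformly spaced levels."""
    step = 2 * np.pi / 2 ** bits
    return np.round(theta / step) * step

def asymptotic_power_ratio(bits):
    """Large-N power ratio for a quantization error uniform on [-pi/2**bits, pi/2**bits]."""
    x = np.pi / 2 ** bits
    return (np.sin(x) / x) ** 2

rng = np.random.default_rng(2)
num_elements = 4096
# Hypothetical cascaded (BS-IRS-user) channel coefficient of each element.
cascaded = (rng.standard_normal(num_elements) + 1j * rng.standard_normal(num_elements)) / np.sqrt(2)

ideal_phase = -np.angle(cascaded)  # continuous phases that align all reflected terms
power_continuous = np.abs(np.sum(cascaded * np.exp(1j * ideal_phase))) ** 2

simulated_ratio = {
    bits: np.abs(np.sum(cascaded * np.exp(1j * quantize_phase(ideal_phase, bits)))) ** 2
    / power_continuous
    for bits in (1, 2, 3)
}
```

In this toy model the ratio is about 0.41 for one-bit phase shifters and about 0.81 for two bits, independently of the number of elements, so the quadratic growth of the received power with the IRS size is preserved.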
The authors in [83] consider the passive beamforming and information transfer (PBIT) technique for an uplink multiuser IRS-assisted SIMO system. The IRS is assumed to follow a passive beamforming (phase shifting) and an on-off reflecting modulation. The authors formulate the problem as a two-step stochastic problem and aim to maximize the achievable user sum rate of the system. A sample average approximation (SAA)-based iterative algorithm and a simplified algorithm, which approximates the stochastic program as a deterministic AO problem, are proposed for the IRS phase-shift matrix design. The solutions are extensible to multi-IRS system models. The simulation results exhibit the large gains offered by the proposed approach over a random phase-shift strategy at the IRS. Table 4 summarizes the review about multiuser IRS-assisted MISO/SIMO approaches. The considered system models, the metrics to evaluate the performance, and the main results are included.

Single-User MIMO Scenarios
The authors in [84] propose different schemes to enhance the power of the composite channel (including the cascaded channel and the direct path), resulting in an improvement of the achievable rate of a single-user MIMO system with LOS. It is also shown that the incorporation of the IRSs in LOS environments increases the rank of the channel matrix and improves the achievable rate by enabling spatial multiplexing.
The authors in [85] focus their attention on achieving the capacity gain of a point-to-point MIMO system with LOS by optimizing the IRS phase matrix to improve the rank of the channel matrix. Table 5 summarizes the system models, the metrics assumed to evaluate the performance, and the main results of [84,85].
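The rank improvement mentioned in both works can be reproduced with a small far-field example. The sketch below assumes uniform linear arrays with half-wavelength spacing, a single LOS path on each link, and hypothetical angles. The direct channel is then a rank-one outer product of steering vectors, and the IRS-reflected path contributes a second rank-one term with different angles.

```python
import numpy as np

def steering_vector(num_elements, angle_rad):
    """Far-field response of a half-wavelength-spaced uniform linear array."""
    n = np.arange(num_elements)
    return np.exp(1j * np.pi * n * np.sin(angle_rad))

num_tx, num_rx, num_irs = 4, 4, 16

# Direct LOS channel: a single path, hence rank one (angles are hypothetical).
H_direct = np.outer(steering_vector(num_rx, 0.3), steering_vector(num_tx, -0.5).conj())

# Cascaded channel: transmitter -> IRS -> receiver, each hop a single LOS path.
a_in = steering_vector(num_irs, 0.9)     # arrival direction at the IRS
a_out = steering_vector(num_irs, -0.4)   # departure direction from the IRS
H_tx_irs = np.outer(a_in, steering_vector(num_tx, 0.7).conj())
H_irs_rx = np.outer(steering_vector(num_rx, -1.0), a_out.conj())

# Unit-modulus phase shifts that coherently combine the reflected path.
Theta = np.diag(a_out * a_in.conj())
H_cascaded = H_irs_rx @ Theta @ H_tx_irs

rank_direct = np.linalg.matrix_rank(H_direct)
rank_total = np.linalg.matrix_rank(H_direct + H_cascaded)
```

In this example the composite channel supports two spatial streams instead of one, because the reflected path arrives and departs at angles different from those of the direct path.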

Multiuser MIMO Scenarios
In [86] the authors address the IRS phase-shift matrix design by considering a low resolution intelligent surface assisting a multiuser system. The IRS is used as a transmitter, which modulates the incoming symbols by adjusting the phases of the reflective elements. The authors consider both the MISO and the MIMO setup, and aim to design the symbol-level IRS modulator to minimize the SER in the SIMO configuration by relaxing the low-resolution phase-shift constraint and solving the problem via a Riemannian conjugate gradient (RCG) algorithm. For the MIMO setup, the symbol-level IRS modulator is iteratively designed by decomposing the original large-scale optimization problem into several sub-problems. Simulation results are presented in terms of the SER by varying the quantization level resolution available to deploy the IRS.
The authors in [87] study the tradeoff between EE and SE in the uplink of a multiuser MIMO system aided by an IRS equipped with discrete phase shifters. The transmission strategy design is based on the partial CSI of the cascaded channels by assuming NLOS between both ends. The precoding design at the users and the IRS phase-shift matrix configuration are addressed to maximize a metric called resource efficiency (RE), which is flexible to balance the EE and the SE [88], for both continuous and discrete phase shifts at the IRS. An AO-based optimization framework is proposed to tackle the problem by developing an iterative MMSE method combined with a projected gradient (PG). The simulation results show the efficiency of the developed optimization framework for RE maximization.
The work in [89] explores optimization-based and data-driven solutions in a multiuser MISO IRS-aided MEC system in a LOS scenario with a multitask AP which is able to perform the joint optimization of the IRS phase-shift matrix and the AP's receive beamforming vectors. A three-step block coordinate descending (BCD) algorithm is first proposed to solve the non-convex maximization problem. In order to reduce the computational complexity, two DL architectures are constructed. The CSI is the input in the first learning architecture, while the second one exploits the users' locations. The two data-driven approaches are trained using data samples generated by the BCD algorithm via supervised learning. The simulation results show a close match between the performance of the optimization-based BCD algorithm and the low-complexity learning-based architectures in terms of the total completed task-input bits at the users in a given time slot. Table 6 summarizes the main system model assumptions, the metrics to evaluate the performance, and the main results of the reviewed single-user IRS-assisted MIMO systems.

IRS Applications
Given the many benefits offered by IRSs, several application scenarios can be examined for this technology. Figure 5 shows several applications which could benefit from the use of IRS technology. The first illustrated application is related to scenarios where some user is totally blocked and unable to communicate with a BS (e.g., [75]). In this case, the IRS could be employed to create a feasible connection, i.e., a virtual link between both ends. This is particularly useful for problems of coverage extension in mmWave and THz communications systems due to the unfavourable free-space omnidirectional path loss in these frequency bands. The second presented application in Figure 5 constitutes a scenario with legitimate users and an eavesdropper device. In this case, the IRS could be employed to mitigate the effect of the eavesdropper by properly cancelling its signal and thus increasing the communication system security. In [80] the authors employ the IRS technology to increase the communication system security by developing a strategy for jamming interference mitigation. The third scenario illustrated in Figure 5, considered in [39,67], is a cellular network with a cell-edge user, which can suffer high signal attenuation from the BS and co-channel interference from nearby BSs. In this scenario, the IRS deployment could be helpful to increase the coverage area in the cellular network. The fourth application focuses on device-to-device (D2D) networks [90], where the IRS could be used to cancel the interference, support the required low-power transmission in these systems, and enhance individual data links in these communication systems. The fifth illustrated application in Figure 5 is the use of IRSs for the mmWave band, where a high sensitivity to blockage is present in the propagation [91]. The IRSs are useful in these scenarios to increase the received power, the channel rank, and thus the spatial diversity needed for outdoor systems. The sixth application shown in Figure 5 is the use of IRSs for Internet of Things (IoT) systems (e.g., [61]), where a multi-IRS scenario seems to be useful to compensate the power loss over long distances and alleviate the energy budget issue in energy-constrained IoT networks via the passive IRS beamforming. The use of the IRS is also interesting for B5G-NOMA systems (e.g., [66,67]). In these scenarios, the IRS could be considered to increase the number of served users and enhance the rate of communication, which constitutes the major requirement to be accomplished in these systems. Another interesting application illustrated in Figure 5 is the use of the IRS technology for CR networks (see [77]), where the IRS could be used to increase the degrees of freedom to further improve the CR network performance. Specifically, the IRS could be employed to improve the efficiency of the secondary transmissions.

Figure 5. IRS applications.
The use of IRSs is beneficial for MEC systems by enhancing both the EE and the SE. For example, in [71] the transmit power of the mobile users is minimized by considering an infrastructure to perform ML tasks in the MEC server. The use of IRSs is also interesting for unmanned aerial vehicle (UAV) networks, where the IRS is used to enhance the quality of the communication between the UAV and the ground users, thus being instrumental for the optimization of the UAV trajectory and the system performance (cf. [3]). The last application of the IRS technology illustrated in Figure 5 is related to the power transfer systems, where the IRS phase-shift matrix is designed to enhance the received signal strength at the energy receivers in the charging zone with the aim of ensuring the energy harvesting requirements [83].

Practical Challenges, Open Issues and Future Research Lines
In this section, we analyze the main challenges for the practical deployment of the IRS technology and the potential techniques which can be utilized to address them. We also discuss some open issues and the future research lines for wireless communication IRS-assisted systems.

Practical Challenges
It is evident that the IRS technology leads to enhancements of the wireless system performance, which is why this topic has received special attention in the literature. However, there are still some challenges for the practical implementation of IRS-assisted systems which also require attention.

Channel State Acquisition
The IRSs constitute a large array of reflective passive elements, i.e., a passive communication element. The enhancements provided by IRS-aided systems rely on the proper configuration of the IRS phase-shift matrix according to the knowledge of the channel conditions. Since the IRS is a totally passive element, the channel state acquisition is usually performed at the communication ends, i.e., the transmitter or the receiver. This leads to degradation in the accuracy of channel estimation, and to high computation and power requirements when considering large IRSs. A practical and efficient channel estimation strategy is still one of the key issues to enable IRS technologies. A possible solution may be the use of low-power sensors, which can be powered by energy-harvesting modules, embedded in the IRS.
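The cost of estimating the channel at the communication ends can be illustrated with a basic least-squares (LS) scheme. The sketch below assumes a single-antenna transmitter and receiver, unit pilots, and a sequence of IRS reflection patterns taken from a DFT matrix; it is a generic textbook-style construction rather than a method proposed in this survey, and all names are hypothetical. With N elements, N + 1 pilot slots are needed to separate the direct channel from the N cascaded coefficients, so the training overhead grows linearly with the IRS size.

```python
import numpy as np

def dft_reflection_patterns(num_elements):
    """(N+1) x (N+1) DFT matrix: column 0 multiplies the direct path (all ones),
    columns 1..N hold the unit-modulus IRS coefficients used in each pilot slot."""
    size = num_elements + 1
    idx = np.arange(size)
    return np.exp(-2j * np.pi * np.outer(idx, idx) / size)

def estimate_channels(received_pilots, patterns):
    """LS estimate of [direct, cascaded_1, ..., cascaded_N]; uses the fact that
    the DFT columns are orthogonal, so the inverse is the scaled conjugate transpose."""
    size = patterns.shape[0]
    return patterns.conj().T @ received_pilots / size

rng = np.random.default_rng(3)
num_elements = 16
true_channels = (rng.standard_normal(num_elements + 1)
                 + 1j * rng.standard_normal(num_elements + 1)) / np.sqrt(2)

patterns = dft_reflection_patterns(num_elements)
received_noiseless = patterns @ true_channels    # one received sample per pilot slot
estimated = estimate_channels(received_noiseless, patterns)
num_pilot_slots = patterns.shape[0]
```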

Implementation and Testbeds
In recent years, a few promising testbeds and experimental activities have been reported [92], as well as some industrial developments [10]. However, these reports are usually difficult to find and insufficient to practically judge the IRS technology.

Practical IRS Deployment
The deployment of the IRSs in a wireless network is another challenging issue. From the optimization viewpoint, when considering a single BS serving a single user, the IRS should be deployed to guarantee clear LOS from the BS with the aim of maximizing the received signal power. However, such a straightforward deployment strategy may not be suitable in multi-user scenarios since one single LOS path between the IRS and the common BS leads to a low-rank channel which will limit the spatial multiplexing for the transmission to multiple users through the IRS [4,57]. In this case, it is preferable to guarantee a high channel rank and a strong LOS within the BS-IRS link. The IRS deployment also needs to consider the spatial user density, the inter-cell interference in cellular networks, etc. Therefore, the autonomous deployment of IRS-assisted systems is still an open challenge that can be addressed with DL and reinforcement learning (RL) techniques by training an NN with some reference allocations and collecting key performance indicators [2].

Design of Model-Driven System
The IRS-assisted systems are more complex to model and design than conventional wireless networks. Due to this higher system complexity, data-driven methods based on DL, RL, and transfer learning constitute promising techniques [2,93]. Table 7 summarizes the main analyzed challenges and the possible techniques to address them.

Open Issues and Future Research Directions
The approaches examined in this review indicate that the IRS technology is receiving special interest in current research because of its promised advantages, such as extending the coverage in cellular networks, minimizing the transmit power, enabling better security, and suppressing interference. However, there still exist challenges, open issues, and new research directions that will continue to receive interest.

Near-Field Region and Spherical Wave-Front Model
Most papers about IRSs assume the far-field operation region. However, the combination of large-scale antennas with high frequencies often results in communications in the near-field region. The operation of IRS-assisted systems in this region and the design particularities by considering the spherical wave-front model have not been thoroughly studied in the literature.
The authors in [94] provide a detailed analysis on the design and use of IRSs by considering both the near-field and far-field radiation characteristics of such surfaces, diverging from the commonly adopted far-field implementation of these apertures in the literature. Some important issues are described in this work which can be considered for design particularities and even for user scheduling in communication systems.
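A common rule of thumb for the boundary between the two regions is the Fraunhofer (Rayleigh) distance 2D²/λ, where D is the largest dimension of the aperture and λ is the wavelength. This criterion is standard antenna theory and is not stated in the survey itself; the aperture sizes and carrier frequencies below are hypothetical. The sketch shows why large surfaces at high frequencies tend to operate in the near field.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s

def fraunhofer_distance(aperture_m, carrier_hz):
    """Conventional far-field boundary 2 * D**2 / wavelength, in metres."""
    wavelength = SPEED_OF_LIGHT / carrier_hz
    return 2.0 * aperture_m ** 2 / wavelength

def in_near_field(distance_m, aperture_m, carrier_hz):
    """True when a user at distance_m is closer than the far-field boundary."""
    return distance_m < fraunhofer_distance(aperture_m, carrier_hz)

# Hypothetical examples: a 1 m surface at 28 GHz versus a 10 cm array at 2.4 GHz.
boundary_mmwave = fraunhofer_distance(aperture_m=1.0, carrier_hz=28e9)
boundary_sub6 = fraunhofer_distance(aperture_m=0.1, carrier_hz=2.4e9)
```

For the first example the boundary lies well beyond 100 m, so a user at a typical indoor or small-cell distance is in the near field, whereas for the second example the boundary is a fraction of a metre.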

Hybrid Transceivers for mmWave IRS-Assisted Systems
To reduce the high power consumption and the radio frequency (RF) cost in mmWave MIMO systems, hybrid analogue-digital architectures have been proposed. The precoder/combiner design in these schemes usually leads to constrained optimization problems. The IRS technology is well-suited for the mmWave band due to the promising advantages offered in terms of increasing the spatial diversity. The joint optimization of the hybrid precoder/combiner for mmWave MIMO and the IRS phase-shift matrix, which can be seen as an analogue precoder/combiner, constitutes a challenging open problem.
In [95], an IRS-assisted downlink mmWave MIMO single-user and multi-stream system with hybrid precoding/combining and NLOS conditions has been approached. The authors focus on maximizing the SE by jointly optimizing the IRS phase-shift matrix and the hybrid transceiver. A manifold optimization-based algorithm is developed by exploiting the structure of the mmWave channel. Although this work is interesting, more complex scenarios can be addressed by considering multiuser and wideband scenarios where the transceiver design is more challenging since the analogue part is frequency flat and common to all the users, not to mention the impact of the beam squint effect present in wideband systems.

User Balancing and Scheduling
Appropriate user scheduling is decisive for practical implementations, which must balance the number of served users according to the power transmission constraints and/or the QoS. User balancing schemes in IRS-assisted systems have not been treated in depth. Limited works can be found regarding user scheduling in IRS-assisted systems, e.g., the authors in [66] analyze the optimization of the IRS phase-shift matrix by considering the detection order of the received users by a common BS in a NOMA MIMO system. These ideas can be extended to mobile networks by balancing the number of served users and the overall system performance, as has already been considered in conventional non-IRS systems, e.g., [96,97] (Algorithms 1-3). The use of the IRS technology together with an efficient scheduling strategy constitutes a novel and challenging open research line.

Wideband Systems
Wideband IRS-aided systems have not received special attention. Most papers about IRSs assume a narrowband model, and these designs cannot be directly extended to the wideband scenario. The reason behind this claim is that the phase shifters in the IRS phase-shift matrix are frequency flat and, therefore, have to be jointly designed for all the subcarriers. Moreover, the large channel bandwidth leads to the beam squint effect, which in turn leads to more constrained designs. The work in [48] considers a wideband IRS-aided communication system under the beam squint effect. However, the work focuses on the channel acquisition problem and not on the IRS design for these systems. The design of IRS-aided wideband systems is certainly a challenging open issue suggested for future research.
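The effect of frequency-flat phase shifts under beam squint can be visualized with a deliberately simplified model. The sketch below assumes a uniform linear array of reflecting elements with half-wavelength spacing at the centre frequency and a single far-field direction, and it only accounts for one side of the cascaded link. The phase shifts are matched to the centre frequency, and the normalized array gain is then evaluated across the band; the number of elements, the angle, and the 10% fractional bandwidth are hypothetical.

```python
import numpy as np

def normalized_array_gain(num_elements, angle_rad, freq_ratio):
    """Array gain (1 = perfect alignment) at frequency f = freq_ratio * f_c when
    the frequency-flat phase shifts are designed for the centre frequency f_c."""
    n = np.arange(num_elements)
    channel_phase = np.pi * n * freq_ratio * np.sin(angle_rad)  # grows with frequency
    flat_phase = -np.pi * n * np.sin(angle_rad)                 # fixed, matched to f_c
    return np.abs(np.mean(np.exp(1j * (channel_phase + flat_phase))))

freq_ratios = np.linspace(0.95, 1.05, 11)  # subcarriers across a 10% fractional bandwidth
gains = np.array([normalized_array_gain(256, np.pi / 3, r) for r in freq_ratios])
```

In this model the gain equals one at the centre frequency and drops sharply towards the band edges for a large surface, which is why a single set of phase shifts cannot serve all subcarriers equally well.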

Conclusions
This paper is a comprehensive survey on IRS technology that addresses the main theoretical aspects of the IRSs, the main approached issues in the state-of-the-art, the applications of IRSs for future wireless communication systems, and the open issues to be approached in the future. First, we have introduced the meta-surface concept, its properties, and some specific hardware parameters to be considered in the design of the IRSs. Next, we have addressed the IRS controller by describing its functionalities and some practical implementations. A comprehensive survey on the approached scenarios by considering the IRS technology for wireless communication systems has been performed. The corresponding system models and their particularities such as multi-antenna setups, the radio propagation conditions (NLOS and LOS), and the hardware impairments have been analyzed. The metrics considered to evaluate the system performance and the main results of relevant works in the literature under different communication system scenarios have been summarized. A vast amount of techniques to handle the problems of the IRS design and the channel estimation in these setups have been analyzed, e.g., DL, AO, and PG. 
Then, some applications where the IRS technology could be beneficial have been described, including communication blockage (which is common in the mmWave band), security, mobile communications, where the service quality for cell-edge users is improved, D2D communications, where IRS-assisted systems suit the required low-power transmission, IoT systems, where IRS-aided systems could be employed to alleviate the energy budget issues, B5G NOMA systems, where the IRS deployment helps to maximize the rate, CR systems, where the EE of the secondary transmissions can be improved by considering IRS-aided systems, MEC systems, where the use of IRSs enhances the EE and the SE, and UAV and power transfer systems, which also benefit from the IRS technology. Finally, the practical challenges and the identified open problems for the design of IRS-aided wireless systems have been discussed, such as the joint design of the IRS matrix and the hybrid transceivers in mmWave, the user scheduling in practical systems, and the special considerations under wideband scenarios such as the beam squint effect.
After reviewing a large number of references in the literature related to the IRS technology, it is clear that this technology represents a very promising strategy to be incorporated in a wide range of wireless communication scenarios. The idea of controlling the propagation environment by using arrays of passive elements with negligible power consumption is extremely attractive to reduce the economic cost and the environmental impact of implementing the future B5G architectures, where the requirements in terms of SE and user accommodation will be extremely challenging. However, the practical deployment of this technology is still at an early stage due to several issues which make it difficult, such as hardware impairments, the difficulty of acquiring accurate channel information, or the need to optimize the positioning of IRSs depending on the communications scenario. For this reason, it is currently difficult to find real implementations of IRS-aided wireless systems in industry, or even testbed prototypes that allow evaluating the practical feasibility of this technology. In this regard, it is clear that more research and technical advances are necessary to address multiple open issues with the aim of moving to a true consolidation of IRS-aided communication systems in the future. Hence, though significant performance improvement can be identified in the literature, IRS-aided systems are expected to provide higher performance gains over the current state-of-the-art approaches once a significant number of practical issues regarding mobility, user scheduling, coding, etc. are addressed. The topic of IRS is expected to receive greater attention from the wireless communication research community.
Author Contributions: D.P.-A. participated in the tasks: conceptualization, formal analysis, investigation, visualization, and writing-original draft. Ó.F. participated in the tasks: conceptualization, formal analysis, investigation, visualization, and writing-original draft. J.P.G.-C. participated in the tasks: conceptualization and writing-review and editing. L.C. participated in the tasks: conceptualization, visualization, and writing-review and editing. All authors have read and agreed to the published version of the manuscript.

Acknowledgments:
The authors thank the Defense University Center at the Spanish Naval Academy (CUD-ENM) for all the support provided for this research.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript:
SDP: semidefinite programming
NN: neural network
PBIT: passive beamforming and information transfer
SER: symbol-error-rate
CNN: convolutional neural network
SAA: sample average approximation
RL: reinforcement learning
LOS: line-of-sight
DC: direct current
ISL: intelligent spectrum learning
NMSE: normalized mean-square error
OMA: orthogonal multiple access
OMP: orthogonal matching pursuit
DDPG: deep deterministic policy gradient
OFDM: orthogonal frequency division multiplexing
DRL: deep reinforcement learning
CS: compressive sensing
TS-OMP: twin-stage orthogonal matching pursuit
BCD: block coordinate descending
MEC: mobile edge computing
QoS: quality of service
FWoL-FPHC: fuzzy win or learn fast-policy hill-climbing
NLOS: non line-of-sight
FPGA: field-programmable gate array
RCG: Riemannian conjugate gradient
TDMA: time-division multiple access
PIN: positive-intrinsic negative
DF: decode-and-forward
MISO: multiple-input single-output
SIMO: single-input multiple-output
UAV: unmanned aerial vehicle
SINR: signal-to-interference-plus-noise ratio
PG: projected gradient
SNR: signal-to-noise ratio
SDR: semidefinite relaxation
mmWave: millimeter-wave